Article

Analysis of Demographic, Familial, and Social Determinants of Smoking Behavior Using Machine Learning Methods

by Joanna Chwał 1,2,3, Małgorzata Kostka 4, Paweł Stanisław Kostka 1, Radosław Dzik 3,*, Anna Filipowska 1 and Rafał Jan Doniec 1

1 Department of Medical Informatics and Artificial Intelligence, Faculty of Biomedical Engineering, Silesian University of Technology, F. D. Roosevelta 40, 41-800 Zabrze, Poland
2 Joint Doctoral School, Silesian University of Technology, Akademicka 2A, 44-100 Gliwice, Poland
3 Department of Clinical Engineering, Academy of Silesia, Rolna 43, 40-555 Katowice, Poland
4 Centre for Diagnosis and Therapy “Famili”, pl. Kościuszki 1, 41-902 Bytom, Poland
* Author to whom correspondence should be addressed.
Appl. Sci. 2025, 15(8), 4442; https://doi.org/10.3390/app15084442
Submission received: 22 February 2025 / Revised: 14 April 2025 / Accepted: 15 April 2025 / Published: 17 April 2025
(This article belongs to the Special Issue Artificial Intelligence in Medicine and Healthcare)

Abstract

Smoking behavior, encompassing both traditional tobacco and electronic cigarette use, is influenced by a range of demographic, familial, and social factors. This study examines the relationship between smoking habits and family dynamics through a cross-sectional survey of 100 participants, using an anonymous questionnaire to collect demographic data, smoking patterns, and familial interactions. Validated instruments, including the Penn State Electronic Cigarette Dependence Index and the Family Relationship Assessment Scale, were employed to assess smoking dependence and family dynamics. The analysis identified key patterns, such as increased smoking frequency among individuals experiencing higher family tension and variations in smoking habits across age and gender groups. Nocturnal smoking was linked to higher cigarette consumption, whereas early-day smokers exhibited a lower desire to quit. Machine learning models were applied to predict and classify smoking behaviors based on socio-demographic and familial variables, with an ensemble learning model achieving the highest accuracy (93.33%), outperforming k-nearest neighbors (90.00%), support vector machines (80.00%), and decision trees (83.33%). These findings underscore the complex interplay between family relationships and smoking behavior, providing insights for public health interventions. Additionally, this study highlights the potential of machine learning in behavioral research, demonstrating its utility in identifying and predicting smoking-related patterns.

1. Introduction

The family is the foundation of social life, providing the individual with a key environment for the formation of interpersonal skills, a value system, and a sense of identity. The relationships that link family members at different stages of life are an important source of connection and social influence. Their importance is evident throughout life, as they play a key role in building well-being and a sense of stability at each stage of development [1]. The healthy development of the child and the functioning of the family depend on the existence of lasting, supportive, and emotionally committed bonds between its members [2]. With supportive social relationships, people can cope better with stress and maintain their well-being. Good relationships can predict interpersonal functioning and mental health, as well as longevity [3]. On the other hand, a dysfunctional family lacks harmony or is fraught with tensions, such as conflicts between parents and children. Dysfunction occurs when one or more family members neglect their responsibilities, which can lead to family disintegration. This type of family is characterized by lower levels of health, well-being, happiness, and positive relationships compared to other families, which can contribute to social problems and also generate more complex difficulties, causing suffering and distress for their members [4].

1.1. Physical and Prenatal Health Impacts

Smoking, particularly traditional tobacco use, has long been a well-documented public health concern, but its impact on family dynamics remains less explored. Steeger et al. show that adolescent externalizing problems, including oppositional and conduct issues, can develop from exposure to parental smoking, although over time harsh parenting and low parent–child bonding are stronger predictors than parental smoking itself [5]. Research among university students showed that those who experienced negative parental relationships and poor father–child bonds were more likely to smoke. Students who faced intense pressure to study also smoked more, which may stem from stress [6].
Families are the primary units of social interaction, and smoking behaviors within them can influence communication patterns, conflict resolution, and shared experiences. In households where smoking is prevalent, these behaviors may not only affect individual health but also disrupt family cohesion [7]. Non-smokers, especially children and elderly family members, may experience significant emotional and physical strain as they contend with health risks, secondhand smoke exposure, and concerns about their loved ones’ well-being. This strain can affect how family members communicate, navigate conflicts, and provide emotional support, underscoring the need to understand the social dimensions of smoking in familial contexts [8].
Smoking during pregnancy poses significant risks to the unborn child, including low birth weight, developmental delays, and an increased likelihood of respiratory issues later in life. Maternal smoking during pregnancy is associated with intrauterine growth restriction, leading to low birth weights. Prenatal tobacco smoke exposure is a well-known risk factor for adverse neurodevelopmental outcomes in childhood, including cognitive delays [9]. The primary effects of maternal smoking on offspring lung function and health include decreased respiratory compliance, increased hospitalization for respiratory infections, and an increased prevalence of childhood wheeze and asthma [10]. These outcomes not only affect the child’s health but can also strain family dynamics as parents navigate the challenges of caring for a child with health complications. Furthermore, the stress associated with smoking-related complications during pregnancy may heighten tension between partners, adding another layer of complexity to family relationships [7,11].
The physical and relational toll on non-smokers extends to children exposed to secondhand smoke, who face heightened risks of respiratory issues, asthma, and developmental challenges [7]. Non-smoking family members, including spouses and elderly relatives, often experience increased stress and concern for loved ones’ health, which can lead to tension within the household. The interplay of these physical and emotional effects highlights the profound impact smoking can have on the dynamics of family life, particularly when shared routines are disrupted or emotionally charged discussions arise over smoking behaviors [12,13]. Guo et al. found that poor family health and dynamics are associated with higher levels of nicotine dependence among smoking fathers. This suggests that dysfunctional family relationships can contribute to increased smoking behaviors, leading to a cycle of stress and tension within the household [14].

1.2. Social and Cultural Factors Shaping Smoking Behaviors

Cultural and socioeconomic factors further shape how smoking behaviors are adopted and maintained within families. In some cultures, smoking is deeply ingrained in social rituals or perceived as a sign of status, influencing how family members view and interact with smoking [15,16]. Gender dynamics also play a crucial role in shaping these behaviors. For instance, in many households, men are more likely to smoke than women, which can normalize smoking behaviors for children, especially boys [17]. Meanwhile, women, particularly mothers, may face societal pressure to quit smoking due to their caregiving roles, creating guilt or stress in households where cessation efforts fail [5,8].
Families in lower socioeconomic brackets often face additional challenges, including limited access to cessation resources or heightened stress that exacerbates smoking habits. These factors can create generational smoking patterns that are difficult to break, complicating efforts to improve family health and cohesion [18]. Financial strain is a significant mediator between low socioeconomic status and smoking behaviors. Individuals experiencing economic hardship often face chronic stress, which can lead to increased tobacco use as a coping mechanism. This stress-induced smoking not only affects the individual’s health but also influences family dynamics, potentially normalizing smoking behaviors for younger family members and perpetuating a cycle of tobacco use across generations [19]. Beyond parental influence, peer-like relationships within families, such as those between siblings or cousins, also play a significant role in shaping smoking behaviors. Younger family members often emulate their older siblings or cousins, complicating cessation efforts that focus solely on parental habits. Interventions that address family-wide behaviors are thus more likely to yield lasting results [11,13].

1.3. The Rise of E-Cigarettes and Technological Influence

The advent of electronic cigarettes, often marketed as less harmful alternatives to conventional tobacco products, has added significant complexity to the already multifaceted landscape of smoking behaviors. These devices have rapidly gained traction, particularly among younger demographics, due to perceptions of reduced harm and the appeal of integrating modern technology into smoking practices [11,20]. Social media campaigns and cessation apps play a dual role in this shift. Although these platforms help spread awareness of smoking risks and provide resources for quitting, they also glamorize e-cigarettes, particularly to younger audiences. Families must navigate these mixed messages, and younger tech-savvy members are often at the forefront of shaping household attitudes toward smoking and e-cigarettes [11,20].
Emerging research on e-cigarettes has also raised concerns about their potential role as a gateway to traditional tobacco use, particularly among youth and young adults. For instance, research replicating earlier studies found that baseline e-cigarette use among adolescents was linked to higher odds of tobacco smoking at 6-month and 12-month follow-ups [21]. Family environments play a significant role in shaping these behaviors, with younger family members often influenced by the habits and attitudes of those around them. E-cigarettes, initially viewed as a safer alternative, can inadvertently lead to greater family conflict if they result in transitioning to traditional smoking. Understanding these behavioral shifts and their impact on family relationships is essential for developing public health strategies that minimize harm [5,22]. In addition, family structure plays a role in adolescent smoking behaviors. Adolescents from non-intact families have been shown to have a higher prevalence of smoking and an earlier onset of cigarette use [23].

1.4. Family Dynamics and Smoking Cessation

Shared smoking habits within families can serve as both a bonding activity and a source of contention. While smoking together may provide moments of connection for some, disagreements about its health risks can lead to heightened conflict and negatively affect the overall family atmosphere. Studies show that adolescent tobacco use is related to family conflict through sensation seeking and impulsive behavior patterns. Research by Eslava et al. indicates that family arguments and stress create conditions that lead younger family members to start smoking [24]. Moreover, research by Hill et al. has found that poor family relationships, such as low parental monitoring and bonding, are associated with higher risks of daily smoking initiation among adolescents. This underscores the importance of a supportive family environment in preventing tobacco use and suggests that conflicts over smoking can undermine such protective factors [25].
The rise of e-cigarettes has introduced new complexities to these interactions, as these devices are often perceived differently from traditional tobacco in terms of health risks and societal acceptance. These shifting perceptions further complicate family dynamics and decision-making around smoking behaviors [11,20,26].
Public health campaigns and smoking cessation programs have played a pivotal role in shaping family behaviors. As smoking rates have declined in many countries, households have increasingly become environments where non-smokers and former smokers can interact more comfortably. These interventions reduce familial conflict related to smoking and foster healthier living conditions for children and non-smokers. Campaigns that emphasize the collective health benefits of quitting, including the positive impacts on family relationships, have proven especially effective in encouraging smoking cessation within households [12].

1.5. Long-Term Effects on Family Cohesion

Smoking behaviors within families often have long-term, intergenerational effects, perpetuating cycles of tobacco use. Children raised in households where smoking is prevalent are more likely to view it as a normative behavior, increasing their likelihood of becoming smokers themselves. Data from the UK government reveal that teenagers whose parents or caregivers smoke are four times as likely to start smoking [27]. This intergenerational transmission underscores the profound impact of parental smoking on youth behavior. Additionally, research by Alves et al. indicates that parental smoking increases the likelihood of adolescent daily smoking, with maternal smoking having a stronger association for girls and paternal smoking for boys [28]. This cycle can also influence how future generations approach health, communication, and conflict resolution within families. Breaking this pattern requires interventions that address not only individual behaviors but also the familial and cultural contexts in which these behaviors are embedded [12,22].
By examining the dynamics within families affected by traditional tobacco use and e-cigarette consumption, researchers aim to illuminate the nuanced ways smoking behaviors influence communication patterns, conflict resolution, and emotional bonding [12]. As smoking rates continue to decline in many populations, the distinctions between smokers and non-smokers within families grow more pronounced, raising critical questions about their impact on family cohesion [13].
This research underscores the importance of considering both individual smoking behaviors and their collective implications within familial contexts to inform smoking prevention and support initiatives.

2. Materials and Methods

This study used a cross-sectional survey design to investigate the relationship between smoking behaviors, including traditional tobacco use and electronic cigarettes, and family dynamics. This research aimed to capture a comprehensive snapshot of participant smoking habits and familial interactions through structured data collection and advanced analytical methods. The stages of the research are shown in Figure 1.

2.1. Survey Instruments

The questionnaire for this study was designed to explore the relationship between smoking behaviors and family dynamics. It included three main sections: Demographic Information, Smoking Habits, and Family Dynamics, alongside validated tools like the Penn State Electronic Cigarette Dependence Index (PSECDI) and the Family Relationship Assessment Scale (FRAS). A literature review guided the inclusion of key variables, and Likert-scale items were extensively used for quantifying subjective experiences [29].
The Demographic Information section collected data on participants’ age, gender, education level, and living environment (rural, suburban, or urban) to analyze sociocultural and geographic influences. The Smoking Habits section assessed smoking type (traditional or electronic), frequency, age of initiation, and duration, enabling comparisons between the impact of tobacco and e-cigarettes on family relationships. The Family Dynamics section used Likert-scale items adapted from FRAS to evaluate communication, conflict resolution, shared activities, and emotional cohesion [30].
The PSECDI, a 10-item measure developed by Foulds et al. [26], was used to assess addiction levels for both traditional and electronic cigarettes; its items can be adapted to evaluate dependence on traditional cigarettes, yielding the Penn State Cigarette Dependence Index (PSCDI). It was administered as one of three components of the questionnaire: (1) author-designed items, (2) the PSECDI scale, and (3) the FRAS family relationship indicators. The PSECDI provides cumulative scores ranging from 0 to 20, categorized as no, low, moderate, or high dependence. The original scale has demonstrated convergent validity (correlation: 0.71 with the E-cigarette Dependence Scale) and construct validity, supported by the observed correlation between test scores and the nicotine concentration of the e-liquids consumed [31]. The PSECDI and FRAS questionnaires were translated using the back-translation method, as no validated Polish versions of these instruments were available. This approach was employed to ensure linguistic and conceptual equivalence between the original and translated versions.
The FRAS assessed family dynamics across three subscales: Family Support, Family Conflicts, and Family Togetherness. Items rated on a 5-point Likert scale provided a comprehensive view of family relationships, with a Cronbach’s alpha of 0.89 for the overall scale and 0.77–0.87 for the subscales [30].
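For readers unfamiliar with the reliability coefficient quoted above, the following is a minimal sketch of how Cronbach's alpha is commonly computed from item-level responses. It is an illustration only: the function name `cronbach_alpha` and the small response matrix are hypothetical, and the values reported in the text come from the cited validation work, not from this code.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    n_items = len(responses[0])
    if n_items < 2:
        raise ValueError("need at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(n_items)]
    # Sample variance of each respondent's total score.
    total_var = variance([sum(row) for row in responses])
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical 5-point Likert responses (4 respondents x 3 items), not study data.
demo = [[4, 5, 4], [2, 2, 3], [5, 5, 5], [3, 3, 2]]
alpha = cronbach_alpha(demo)
```

Values closer to 1 indicate that the items of a subscale vary together, which is the sense in which the coefficients above are read as evidence of internal consistency.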
E-cigarette and traditional cigarette use were recorded as binary variables (yes/no) to assess general usage prevalence. Additional items assessed usage frequency and dependence levels using Likert scales and the PSECDI score. Correlation analyses involving binary smoking use variables were interpreted cautiously due to scale limitations.
Additional self-administered questions captured perceptions of smoking-related family issues, including conflicts, relationship impacts, and quitting attempts, offering further insights into the interplay between smoking behaviors and family dynamics.

2.2. Participant Recruitment

Participants for this study were recruited using a convenience sampling strategy through online platforms, community forums, email invitations, and local networks to ensure accessibility and diversity. Eligibility criteria included being 18 years or older, providing informed consent, and proficiency in Polish to complete the survey. Exclusion criteria encompassed individuals under 18, those who declined consent, or were unable to complete the questionnaire. The final sample of 100 participants represented a range of age groups, gender identities, educational backgrounds, and living environments (urban, suburban, rural).
Ethical standards were rigorously upheld throughout the study. Participants were provided with detailed information about the study’s purpose and assured of anonymity. Electronic informed consent was obtained, and participants were reminded of their right to withdraw at any point. The survey collected no personally identifiable information, ensuring privacy and compliance with ethical guidelines. This study was conducted with the approval of the Bioethical Committee of Medical University of Silesia in Katowice, Poland, dated 16 October 2018, approval number KNW/0022/KB1/79/18.

2.3. Data Collection

Data collection was conducted using Google Forms, selected for its accessibility, compatibility across devices, and ease of use. The survey link was distributed via social media, community groups, and email lists to ensure broad outreach and inclusivity. This digital format minimized human error and allowed participants to complete the survey conveniently on various devices [26].
The collected data were examined through a combination of statistical tests and machine learning techniques to identify significant associations and patterns. Descriptive statistics provided a foundational understanding of the sample’s characteristics, while statistical tests were employed to examine potential relationships between variables. Machine learning algorithms were then used to predict smoking behaviors based on the identified factors.

2.4. Statistical Analysis

The statistical analysis was conducted to investigate the relationships between socio-demographic, familial, and behavioral factors influencing smoking behaviors, cessation attempts, and related variables. The dataset included categorical, ordinal, and continuous variables, which were preprocessed and categorized as needed. For example, participants were grouped into three age categories (18–22, 23–27, and 28–39 years) for ANOVA, based on the tertile distribution of age. Chi-square tests were employed to evaluate associations between categorical variables (e.g., gender and type of smoking product). Spearman’s rank correlation was used to assess relationships involving ordinal or non-normally distributed variables (e.g., Likert-scale ratings and initiation age). One-way ANOVA was applied to compare group means for continuous variables that met the assumptions of normality and homogeneity of variance (e.g., PSECDI scores). Assumptions for ANOVA were verified using the Shapiro–Wilk test for normality and Levene’s test for equality of variances. The significance threshold was set at p = 0.05. All analyses were conducted in MATLAB R2019b for academic use (MathWorks Inc., Natick, MA, USA).
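As an illustration of the chi-square test of independence mentioned above, the sketch below computes the Pearson statistic and its degrees of freedom for a contingency table. It is written in plain Python rather than MATLAB, the function name `chi_square_statistic` is hypothetical, and the counts are invented for the example; it assumes every row and column total is non-zero.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical counts (rows: gender; columns: electronic, traditional, both).
counts = [[20, 10, 15], [25, 12, 18]]
stat, dof = chi_square_statistic(counts)
```

The statistic is then compared with the chi-square distribution for the given degrees of freedom; at the 0.05 level the critical values are 3.841 for one degree of freedom and 5.991 for two.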

2.5. Machine Learning Methods

Machine learning techniques were employed to classify and predict smoking behaviors based on socio-demographic, familial, and behavioral factors. The choice of algorithms was guided by their compatibility with the dataset’s size and feature characteristics. The decision tree was selected as a baseline model due to its interpretability and ability to handle nonlinear relationships effectively. The ensemble method was employed to enhance robustness and mitigate overfitting by aggregating multiple weak learners, a strategy well suited for managing the variability inherent in the dataset. Support vector machines (SVMs) were chosen for their theoretical advantage in separating classes by maximizing decision margins, although their practical performance was constrained by the limited dataset size. Finally, the k-NN algorithm was implemented for its simplicity and strength in capturing localized relationships, making it an effective tool for identifying proximity-based patterns within the data.
To better understand the structure of the data and assess the complexity of the classification task, a Principal Component Analysis (PCA) was conducted. The initial PCA was performed using all available features from the demographic, FRAS, and PSECDI sections of the questionnaire. In addition, a refined PCA was carried out on the six most statistically discriminative variables, identified via the ANOVA F-test, to enhance interpretability. These projections aimed to explore class separability and identify meaningful clustering patterns within the dataset.
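The feature-ranking step can be sketched as follows: compute a one-way ANOVA F-statistic for each feature across the class labels and keep the features with the largest values. This is a plain-Python illustration under stated assumptions, not the authors' MATLAB code; the names `anova_f` and `top_features` are hypothetical, and at least two classes are assumed.

```python
def anova_f(values, labels):
    """One-way ANOVA F-statistic of a single feature across class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    # Between-group and within-group sums of squares.
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups.values())
    ss_within = sum((v - sum(g) / len(g)) ** 2
                    for g in groups.values() for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def top_features(columns, labels, k=6):
    """columns maps feature name -> list of values; returns the k best names."""
    scores = {name: anova_f(vals, labels) for name, vals in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]
```

For example, `top_features(columns, labels, k=6)` would return six feature names, matching the number of variables retained for the refined PCA.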
The preprocessing and analysis of the dataset were designed to facilitate robust machine learning modeling while addressing data quality issues and extracting meaningful insights. Initially, the data were subjected to preprocessing steps, including the removal of rows with missing values to maintain the integrity of the dataset. Continuous variables, such as smoking frequency and family support ratings, were normalized using z-score normalization to ensure compatibility with machine learning algorithms and to avoid biases due to differing scales. In addition, categorical variables, including family smoking history, conflict levels, and type of smoking, were encoded in numerical formats. Feature engineering was applied to construct composite variables that represent the impact of smoking on family relationships and behaviors based on Likert-scale responses. The class imbalance was addressed by oversampling the minority class, particularly for variables with significant disparity in the response distribution. The data were then partitioned into training (70%) and testing (30%) subsets.
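The preprocessing steps described above can be sketched in plain Python as shown below. All function names are hypothetical and the code is not the authors' MATLAB pipeline. Two choices are assumptions made here rather than details given in the text: the z-score uses the sample standard deviation, and oversampling is applied to the training portion only, so that duplicated rows cannot also appear in the test set.

```python
import random
from collections import Counter
from statistics import mean, stdev


def zscore(column):
    """Standardize one continuous variable to zero mean and unit variance."""
    mu, sigma = mean(column), stdev(column)
    return [(x - mu) / sigma if sigma > 0 else 0.0 for x in column]


def train_test_split(rows, labels, test_fraction=0.3, seed=0):
    """Shuffle indices with a fixed seed and hold out a fraction for testing."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = round(len(idx) * test_fraction)
    test, train = idx[:n_test], idx[n_test:]
    return ([rows[i] for i in train], [labels[i] for i in train],
            [rows[i] for i in test], [labels[i] for i in test])


def oversample(rows, labels, seed=0):
    """Randomly duplicate minority-class rows until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels
```

A typical call sequence would be `train_test_split` followed by `oversample` on the returned training rows and labels; with 100 rows and the default fraction this yields 70 training and 30 test rows before balancing.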
Several algorithms were implemented, including a decision tree classifier, ensemble learning methods (using a bagging approach), k-nearest neighbors (k-NN), and support vector machines (SVMs). Each model was trained and validated using stratified k-fold cross-validation to minimize overfitting and assess generalizability. Metrics such as accuracy and loss were used to evaluate model performance, and confusion matrices were generated to analyze classification consistency. All machine learning analyses were performed using MATLAB R2019b for academic use (MathWorks Inc., Natick, MA, USA) and its machine learning tools, ensuring a reproducible and systematic approach to analysis.
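To make the validation scheme concrete, the following sketch pairs a simple k-NN classifier with stratified k-fold cross-validation. It is a plain-Python illustration with hypothetical function names, not the MATLAB implementation used in the study; the number of folds and the number of neighbors are placeholder values, since the text does not report them, and each class is assumed to have at least as many samples as there are folds.

```python
import math
import random
from collections import Counter, defaultdict


def stratified_folds(labels, k=5, seed=0):
    """Assign sample indices to k folds while keeping class ratios similar."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class's samples round-robin across the folds.
        for pos, i in enumerate(indices):
            folds[pos % k].append(i)
    return folds


def knn_predict(train_x, train_y, x, k=3):
    """Majority vote among the k training points closest to x (Euclidean)."""
    nearest = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], x))[:k]
    return Counter(y for _, y in nearest).most_common(1)[0][0]


def cross_val_accuracy(xs, ys, n_folds=5, k=3):
    """Mean held-out accuracy over the stratified folds."""
    scores = []
    for held_out in stratified_folds(ys, n_folds):
        held = set(held_out)
        tr_x = [x for i, x in enumerate(xs) if i not in held]
        tr_y = [y for i, y in enumerate(ys) if i not in held]
        correct = sum(knn_predict(tr_x, tr_y, xs[i], k) == ys[i] for i in held_out)
        scores.append(correct / len(held_out))
    return sum(scores) / len(scores)
```

Stratification matters here because, with few samples per class, an unstratified split could leave a fold with no examples of a rare class.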
The classification task aimed to predict how participants perceived the impact of their smoking behavior on family relationships. The target variable was derived from a survey item with four ordinal response categories, 0.00, 0.33, 0.66, and 1.00, reflecting increasing levels of perceived negative impact on family dynamics. These values served as discrete class labels for supervised learning. Across the total sample of 100 participants, the distribution of classes was imbalanced, with the majority of responses concentrated in the 0.33 and 0.66 categories. To address this imbalance and ensure fair model training, random oversampling of the minority classes was applied during preprocessing. The models were trained on a set of 9 to 12 features, selected from all three sections of the questionnaire: (1) original demographic and behavioral items, (2) FRAS subscales (Support, Conflict, Togetherness), and (3) PSECDI-derived items related to smoking habits and dependence. The selected features included variables such as age, gender, type and frequency of smoking, presence of smokers in the family, conflict levels, perceived family acceptance, and time to first cigarette. Given the relatively small sample size (N = 100) compared to the number of input features, care was taken to mitigate potential overfitting and dimensionality issues. This was achieved through feature selection, model regularization, and cross-validation techniques during training and evaluation.

3. Results

3.1. Statistical Analysis

Understanding the demographic and behavioral characteristics of survey respondents is crucial for interpreting patterns in their responses. This section explores key variables through statistical distributions, aiming to contextualize the broader study. These variables include gender, age, living environment, educational attainment, and cigarette use preferences (Table 1).
The dataset analyzed in this section was derived from a structured survey. Key columns were selected for relevance, including gender, age, living environment, educational level, and type of cigarettes used. Gender was classified as male, female, or other, while age was reported in years. Living environments were grouped into cities with over 500,000 residents, cities with 150,000–500,000 residents, cities with 50,000–150,000 residents, cities with up to 50,000 residents, and rural areas. Educational levels were categorized as university, secondary, vocational, lower secondary, or primary. Cigarette use was divided into three categories: electronic, traditional, or both.
After collecting 100 completed surveys, the following results were obtained. The age distribution of respondents was primarily concentrated in the early twenties, with the majority being under 30 years of age, forming a unimodal pattern. Regarding the living environment, a significant proportion of participants reported residing in large urban areas, followed by those living in medium-sized cities. Educational attainment skewed toward higher levels, with most respondents reporting secondary or university education, while vocational and primary education were less represented. As for smoking preferences, electronic cigarettes emerged as the most commonly used product, followed by a substantial number of individuals who used both electronic and traditional cigarettes. Exclusive use of traditional cigarettes was the least common among the sample.
The Family Relationship Assessment Scale (FRAS) (Table 2) provides valuable insights into familial dynamics by evaluating family support, family conflicts, and family togetherness. Each FRAS subscale (Family Support, Family Conflict, Family Togetherness) is based on a 5-point Likert scale, where 1 indicates the lowest and 5 the highest level of the measured dimension. In this study, all scores are reported as group means within the valid range of 1–5. This analysis investigates these dimensions among individuals with different smoking behaviors: electronic cigarette (e-cigarette) users, traditional cigarette smokers, and dual users of both products.
E-cigarette users reported the highest levels of family support, with an average score of 4.22. This suggests that they perceive strong support within their family environment. Conversely, traditional cigarette smokers reported the lowest level of support, averaging 3.50, indicating weaker familial bonds or a sense of reduced support. Dual users, who consume both e-cigarettes and traditional cigarettes, fell in between with a support score of 3.67, reflecting moderate family support but potentially influenced by the complexity of their smoking behaviors.
In terms of family conflicts, the interpretation of scores is inverted—lower scores indicate fewer conflicts. E-cigarette users had the lowest conflict score, averaging 1.81, which reflects minimal tensions within their family dynamics. Dual users also reported relatively low conflict levels, with a score of 1.89, while traditional cigarette smokers had the highest level of conflict among the groups, scoring 2.00. These results suggest that families are more accepting of e-cigarette use, potentially due to its perceived reduced risks compared to traditional smoking. Meanwhile, the higher conflict scores among traditional cigarette smokers may stem from the stronger stigma and health concerns associated with conventional smoking.
The dimension of family togetherness revealed additional differences. E-cigarette users reported the highest levels of togetherness, averaging 3.26, which underscores a greater engagement with family activities. Traditional cigarette smokers scored slightly lower at 3.12, while dual users reported the least amount of shared time, with an average score of 2.83.
These results point to a need for educational interventions that address how differently families perceive e-cigarettes and traditional cigarettes. Because the perceived lower risk of e-cigarettes may explain why families are more accepting of their use, educational campaigns should deliver accurate scientific information about the actual dangers of both products. Such programs should also create opportunities for families to discuss nicotine use without confrontation, to minimize conflict and reduce stigma. When families focus on shared health protection goals, they may be able to turn accusatory dialogues into supportive exchanges that lead to better cessation and prevention practices.
The Penn State Electronic Cigarette Dependence Index (PSECDI) (Table 2) is a validated tool designed to assess the degree of nicotine dependence among users of electronic cigarettes. Scores range from 0 to 20, and dependence levels were categorized as follows: no (0), low (1–4), moderate (5–8), and high (≥9) dependence. Its adaptation for the current study allows for a comparative analysis of dependence levels across three groups: e-cigarette users, traditional cigarette smokers, and dual users.
PSECDI results are presented both as percentages of respondents falling into each dependence category and as a mean score for each group, reflecting the average level of nicotine dependence. This dual presentation enables both categorical comparison and identification of trends in severity within each smoking group.
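As an illustration of the categorization and the dual presentation described above, the following Python sketch assigns a PSECDI score to a dependence category using the cut-offs stated in the text and then summarizes a group as category percentages plus a mean score. The function names and the example scores are hypothetical and are not taken from the study's data or code.

```python
def psecdi_category(score):
    """Map a PSECDI total score (0-20) to the dependence category used in the text."""
    if not 0 <= score <= 20:
        raise ValueError("PSECDI score must be between 0 and 20")
    if score == 0:
        return "no"
    if score <= 4:
        return "low"
    if score <= 8:
        return "moderate"
    return "high"


def summarize_group(scores):
    """Return (percentage per category, mean score) for one smoking group."""
    counts = {"no": 0, "low": 0, "moderate": 0, "high": 0}
    for s in scores:
        counts[psecdi_category(s)] += 1
    n = len(scores)
    shares = {name: round(100 * c / n, 2) for name, c in counts.items()}
    return shares, sum(scores) / n


# Invented scores for a small illustrative group (not study data).
example_scores = [0, 3, 6, 12]
shares, mean_score = summarize_group(example_scores)
print(shares, mean_score)
```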
The levels of dependence, categorized as no dependence, low dependence, moderate dependence, and high dependence, varied significantly between the groups. Among e-cigarette users, 47.37% fell into the high-dependence category, while 31.58% exhibited moderate dependence, and 21.05% reported low dependence. Notably, no respondents in this group reported an absence of dependence. For traditional cigarette smokers, the distribution was more diverse. While 42.86% demonstrated high dependence, 33.33% exhibited low dependence, and 19.05% fell into the moderate dependence category. A small proportion of this group (4.76%) reported no dependence, reflecting some variability in addiction levels within this cohort. Individuals who smoke both e-cigarettes and traditional cigarettes showed the highest prevalence of dependence, with 50.00% classified as highly dependent and 33.33% moderately dependent. Like e-cigarette users, no respondents in this group reported an absence of dependence, while 16.67% demonstrated low dependence. The average PSECDI scores further illustrate the differences in dependence levels among the groups. Dual users had the highest mean score of 13.50, indicating the most severe dependence on nicotine products. E-cigarette users followed closely with an average score of 13.11, suggesting a significant level of addiction within this group. Traditional cigarette smokers exhibited the lowest mean PSECDI score at 12.05, although this still reflects a considerable degree of dependence.
Results showed significant differences in family support (p = 0.021) and family togetherness (p = 0.043) among smoking groups. Post hoc comparisons indicated that e-cigarette users reported significantly higher family support than traditional smokers, and significantly greater family togetherness than dual users. No significant differences were observed between dual users and traditional smokers in either dimension.
PSECDI scores were highest among dual users (mean = 13.50), followed closely by e-cigarette users (13.11). Traditional smokers had the lowest average dependence (12.05). A one-way ANOVA test confirmed a statistically significant difference in dependence scores between e-cigarette users and traditional cigarette smokers (p = 0.037). No significant differences were observed between dual users and the other two groups.
This study examined the influence of socio-demographic, familial, and behavioral factors on smoking behaviors and cessation attempts, yielding critical insights into underlying patterns and associations (Table 3).
Age was found to have a statistically significant relationship with the type of smoking product used (F = 8.787, p = 0.0003). Additionally, age correlated positively with the age of smoking initiation (ρ = 0.22546, p = 0.0241), indicating that older participants tended to start smoking later in life. Gender was also significantly associated with the type of smoking product (χ² = 10.63, p = 0.031). In contrast, education level (χ² = 6.41, p = 0.60111), place of residence (χ² = 7.02, p = 0.53404), and a family history of smoking (χ² = 2, p = 0.36788) showed no significant associations with the type of smoking product used.
Behavioral patterns demonstrated critical findings. Nocturnal smoking behavior was significantly correlated with higher smoking frequency (χ² = 14.2456, p = 0.014122). A strong positive Spearman correlation was observed between e-cigarette use and prior conventional smoking (ρ = 0.67921, p = 8.0027 × 10⁻¹⁵), emphasizing a behavioral linkage between these habits. The time to the first cigarette after waking was negatively correlated with willingness to quit smoking (ρ = −0.30509, p = 0.002), indicating that individuals who smoked earlier in the day were less likely to exhibit cessation intent. However, smoking frequency itself showed no significant correlation with the willingness to quit (ρ = −0.07068, p = 0.4847).
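For readers unfamiliar with the Spearman coefficient reported above, the following minimal Python sketch computes ρ as the Pearson correlation of rank-transformed values, with tied observations receiving their average rank. It is a generic illustration: the helper names and the example data are hypothetical, and it does not compute p-values or reproduce the study's analysis.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sqrt(sum((a - mx) ** 2 for a in rx))
    sy = sqrt(sum((b - my) ** 2 for b in ry))
    if sx == 0 or sy == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / (sx * sy)


# Invented ordinal codes for two questionnaire items (not study data).
time_to_first_cigarette = [1, 2, 2, 3, 4, 4, 5]
willingness_to_quit = [4, 5, 3, 3, 2, 1, 1]
print(round(spearman_rho(time_to_first_cigarette, willingness_to_quit), 3))
```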
Familial dynamics were found to play a significant role in smoking behaviors. Family tension correlated positively with the number of cigarettes smoked (ρ = 0.22 to 0.34, p-values ranging from 0.029 to 0.001). However, the impact of smoking on relationships (ρ = −0.15043, p = 0.1352), family conflicts (χ² = 5.15, p = 0.27194), family support (coefficients: 1.37, −0.18, p = 0.0583, p = 0.3000), and family acceptance (ρ = −0.088 to −0.076, p-values ranging from 0.334 to 0.467) did not show statistically significant relationships with smoking behaviors. Table 3 summarizes the factors analyzed for their influence on smoking behavior.
The results in Table 3 demonstrate the diverse influences of socio-demographic, behavioral, and familial factors on smoking behaviors. For example, the analysis of variance (ANOVA) revealed significant differences in smoking product preferences across different age groups (F = 8.787, p = 0.0003), suggesting that age influences the type of smoking products individuals use. This may reflect generational trends, with younger participants possibly favoring e-cigarettes due to cultural perceptions or accessibility. Conversely, older individuals may be more inclined to use traditional tobacco products. In contrast, education level was not significantly associated with the type of smoking product used (χ² = 6.41, p = 0.6011), indicating that smoking habits in this sample appear to transcend educational boundaries.
A chi-square test revealed a statistically significant association between current e-cigarette use and prior use of traditional cigarettes (χ² = 8.30, p = 0.00396), suggesting that many individuals transitioned from conventional smoking to electronic nicotine delivery systems. This relationship, captured by the variable “prior use of traditional cigarettes”, highlights a behavioral continuum between the two forms of smoking. Additionally, a moderate negative correlation was found between the time to the first cigarette after waking and willingness to quit smoking (ρ = −0.30509, p = 0.002), implying that individuals who smoke shortly after waking may be more addicted and less likely to consider cessation.
Interestingly, family acceptance showed no significant correlation with smoking frequency (ρ = −0.088 to −0.076, p > 0.05), suggesting that perceived family approval or disapproval may not directly influence how often participants smoke. These findings reinforce the importance of designing interventions that reflect the specific behavioral patterns and psychosocial predictors identified through analysis.

3.2. Machine Learning Analysis

This study utilized machine learning methodologies to explore the associations between smoking behaviors and familial dynamics, leveraging a dataset with numerically encoded responses. Key variables included smoking frequency, family smoking history, conflicts arising from smoking, perceived relational impacts, and levels of familial acceptance. The comparative performance of different models provided insights into the predictive value of these variables and the efficacy of advanced analytical approaches.
To assess class separability and identify structure in the feature space, a Principal Component Analysis was conducted. In the initial projection using all available variables (demographic, FRAC, and PSECDI), class overlap was substantial, suggesting high complexity and limited separability. To improve interpretability, a second PCA was performed using the six most discriminative features, selected via ANOVA F-test. As shown in Figure 2, this reduced-dimensional projection revealed improved clustering of participants based on their perceived impact of smoking on family relationships. Some degree of class separation, especially for extreme values, supports the existence of meaningful patterns and justifies the use of supervised learning methods.
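The feature-selection step that precedes the second PCA can be sketched as follows. This minimal Python example computes a one-way ANOVA F statistic for each feature with respect to the class labels and keeps the highest-scoring features. The function names, the toy feature columns, and the choice to rank purely by F are assumptions made for illustration; the study's own implementation is not shown in the text.

```python
def anova_f(values, labels):
    """One-way ANOVA F statistic of a single feature across label groups."""
    groups = {}
    for v, g in zip(values, labels):
        groups.setdefault(g, []).append(v)
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    # Between-group and within-group sums of squares.
    ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values())
    ssw = sum((v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g)
    if ssw == 0:
        return float("inf") if ssb > 0 else 0.0
    return (ssb / (k - 1)) / (ssw / (n - k))


def top_features(columns, labels, m):
    """Return the names of the m features with the largest F statistic."""
    scores = {name: anova_f(col, labels) for name, col in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)[:m]


# Toy data (invented): 'signal' separates the two classes, 'noise' does not.
labels = ["a", "a", "a", "b", "b", "b"]
columns = {
    "signal": [1, 2, 3, 4, 5, 6],
    "noise": [1, 5, 3, 2, 6, 4],
}
print(top_features(columns, labels, 1))
```

In the study, six features were retained; here `m` plays that role.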
In the data preprocessing phase for machine learning (ML) applications, it is crucial to visualize and inspect the data to ensure its integrity and suitability for modeling. Two visualizations were implemented to assess the quality and interpretability of the dataset before it was passed to the ML algorithms. First, categorical responses regarding the perceived impact of smoking on interpersonal relationships were converted to numeric values using ‘containers.Map’ in MATLAB, which mapped predefined text responses to corresponding numeric values. The question that elicited these responses was “What is your perception of the impact of your smoking on family relationships?”. The responses were mapped as follows: “I believe it clearly harms our relationships” was mapped to 1, indicating a strong negative impact; “It has a certain negative impact, but not large” was mapped to 0.66, representing a mild negative impact; “I don’t notice any impact” was mapped to 0.33, indicating minimal or no impact; and “I think it has a positive effect” was mapped to 0, indicating a positive effect. A scatter plot (Figure 3) was then generated to examine the relationship between the age at which each person started smoking and the perceived impact on relationships. This visualization simplified the identification of trends or potential correlations between the variables of interest. The second visualization is a histogram (Figure 4) of the distribution of smoking frequency among participants. It allowed for a detailed examination of the shape, central tendency, and spread of the distribution, which could inform decisions about scaling or transforming this feature. Taken together, these graphs serve as an exploratory tool for detecting issues such as outliers, skewness, or inconsistencies, thus ensuring that the data are properly prepared before being passed to the learning models.
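The response encoding described above can be expressed compactly with a lookup table. The sketch below is a Python analogue of the MATLAB ‘containers.Map’ step, not the authors' code: the dictionary values follow the mapping given in the text, while the function name, the whitespace and apostrophe normalization, and the choice to return `None` for unrecognized answers are assumptions.

```python
# Numeric codes for the perceived-impact question, as listed in the text.
IMPACT_CODES = {
    "I believe it clearly harms our relationships": 1.0,
    "It has a certain negative impact, but not large": 0.66,
    "I don't notice any impact": 0.33,
    "I think it has a positive effect": 0.0,
}


def encode_impact(responses):
    """Convert free-text answers to numeric codes; unknown answers become None."""
    encoded = []
    for text in responses:
        # Normalize surrounding whitespace and typographic apostrophes.
        key = text.strip().replace("\u2019", "'")
        encoded.append(IMPACT_CODES.get(key))
    return encoded


print(encode_impact([
    "I believe it clearly harms our relationships",
    "I don't notice any impact",
]))
```

Returning `None` rather than raising an error makes missing or unexpected answers easy to count during the data inspection stage.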
Hyperparameter tuning was critical in improving the performance of the machine learning models deployed. The ensemble model’s parameters, such as the number of trees in the forest and the maximum depth of each tree, were tuned using a grid search. This approach evaluated combinations of parameter values using cross-validation and selected those that minimized error while improving model generalizability. Similarly, the k-NN model was tuned by varying the number of neighbors (k) and using cross-validation to find the best balance between underfitting and overfitting. The best performance was obtained with k = 5, which adequately captured the underlying data patterns without making the model overly complex. In this context, several key features were used in the models:
  • NumericImpactResponses—questions assessing the impact on family relationships;
  • NumericConflictResponses—questions about family conflicts;
  • NumericAcceptanceResponses—questions about family acceptance of smoking;
  • NumericFamilySmoking—question whether anyone in the family smokes;
  • CleanedSmokingFrequency—represents the individual’s smoking frequency.
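The tuning of k described above can be illustrated with a small, dependency-free Python sketch: a plain k-NN classifier is scored with leave-one-out cross-validation for each candidate k, and the best-scoring value is kept. Leave-one-out is used here only for brevity; the text does not state the cross-validation scheme, and the function names and toy data are hypothetical.

```python
from collections import Counter
from math import dist


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training points closest to the query."""
    nearest = sorted(range(len(train_x)), key=lambda i: dist(train_x[i], query))[:k]
    votes = Counter(train_y[i] for i in nearest)
    return votes.most_common(1)[0][0]


def loo_accuracy(xs, ys, k):
    """Leave-one-out accuracy: each point is predicted from all the others."""
    hits = 0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        hits += knn_predict(rest_x, rest_y, xs[i], k) == ys[i]
    return hits / len(xs)


def grid_search_k(xs, ys, candidates):
    """Score every candidate k and return (best k, all scores)."""
    scores = {k: loo_accuracy(xs, ys, k) for k in candidates}
    best = max(scores, key=scores.get)  # the earliest candidate wins ties
    return best, scores


# Toy data (invented): two well-separated clusters of four points each.
xs = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5), (5, 6), (6, 5), (6, 6)]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
print(grid_search_k(xs, ys, [1, 3, 7]))
```

On these toy clusters a very large k fails because each held-out point is outvoted by the other cluster, which mirrors the underfitting side of the trade-off mentioned in the text.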
In this study, a decision tree model was developed to classify data based on two primary features: NumericImpactResponses and NumericConflictResponses. Although the algorithm initially had access to nine features, it automatically selected these two as the most relevant for predicting outcomes. The structure of the decision tree is hierarchical (Figure 5), with the root node splitting the dataset based on the value of NumericImpactResponses, applying a threshold of 0.495. Instances with NumericImpactResponses values below this threshold are directed to the left child node, while those with values equal to or exceeding 0.495 are directed to the right child node. This primary division reflects the perceived impact of smoking on relationships, as measured by NumericImpactResponses, and serves as the foundation for subsequent classifications.
At the second level of the tree, the child nodes resulting from the initial split are further divided based on the value of NumericConflictResponses, with a threshold of 0.625. Each terminal node corresponds to a distinct combination of values for NumericImpactResponses and NumericConflictResponses:
  • When NumericImpactResponses is less than 0.495, the model predicts an outcome of 0.33.
  • When NumericImpactResponses is greater than or equal to 0.495 and NumericConflictResponses is equal to or above 0.625, the predicted outcome is 1.
  • When NumericImpactResponses is greater than or equal to 0.495 and NumericConflictResponses is below 0.625, the predicted outcome is 0.66.
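The three rules listed above can be written as a short function. The sketch below is a direct transcription of the thresholds and predicted values given in the text; the function and argument names are hypothetical, and the fitted tree itself is the one shown in Figure 5.

```python
def predict_impact_class(impact, conflict):
    """Apply the two-level decision rules described in the text.

    impact   -- value of NumericImpactResponses
    conflict -- value of NumericConflictResponses
    """
    if impact < 0.495:
        return 0.33
    if conflict >= 0.625:
        return 1.0
    return 0.66


# Example call with invented feature values.
print(predict_impact_class(impact=0.66, conflict=0.7))
```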
Beyond the structural insights, the importance of the selected features was analyzed to understand their relative contribution to the model’s performance. The results demonstrate that for the decision tree (Figure 6), NumericImpactResponses is far more influential, with an importance score of 0.1875, compared to NumericConflictResponses, which has a score of 0.03807. This finding suggests that the perceived impact of smoking on relationships (NumericImpactResponses) is the dominant driver of decision-making within the model, while conflict-related considerations (NumericConflictResponses) play a secondary role.
The feature importance analysis is particularly significant because it underscores the model’s capacity to discern and prioritize the most predictive features from a broader set of variables. The substantial difference in importance scores indicates that NumericImpactResponses is central to understanding the relationships in the data, aligning with its role as the root node’s splitting criterion.
The ensemble method aggregates feature importance (Figure 7) across multiple base learners (e.g., decision trees) to provide a robust measure of feature contribution. In this case, the most influential feature is NumericImpactResponses, with a score of 0.1459, far exceeding the contributions of other features. Other features, such as Gender (0.0085), CleanedSmokingFrequency (0.0023), and When did you start smoking (0.0031), contribute marginally, while several features exhibit no measurable importance.
This sparse distribution of importance scores is characteristic of ensemble methods when dealing with datasets where only a subset of features holds predictive power. The dominance of NumericImpactResponses underscores its centrality to the predictive mechanism of the ensemble model, likely reflecting a strong and consistent relationship with the target variable.
The feature importance for the SVM model (Figure 8) was determined using the permutation method, which assesses the impact of randomly shuffling a feature on model performance. Here, NumericImpactResponses emerges as the dominant feature, with an importance score of 0.1867, followed by Age (0.0167), NumericFamilySmoking (0.0133), and When did you start smoking (0.0067). The remaining features have negligible or zero importance.
This result suggests that the SVM model, which relies on maximizing the margin between classes, identifies NumericImpactResponses as critical to defining the decision boundary. The relative importance of NumericFamilySmoking, Age, and When did you start smoking highlights secondary relationships that contribute to the model’s performance, albeit at a smaller scale.
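The permutation method mentioned above can be sketched generically in Python: each feature column is shuffled in turn, and the resulting drop in accuracy, averaged over several shuffles, is taken as that feature's importance. The function names, the number of repeats, and the toy classifier are assumptions for illustration and do not reproduce the SVM or k-NN models of the study.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which the model's prediction matches the label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(predict, rows, labels, n_repeats=20, seed=0):
    """Mean accuracy drop per feature when that feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for col in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            shuffled = [r[col] for r in rows]
            rng.shuffle(shuffled)
            permuted = [
                tuple(r[:col]) + (v,) + tuple(r[col + 1:])
                for r, v in zip(rows, shuffled)
            ]
            drops.append(baseline - accuracy(predict, permuted, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy model (invented): the prediction depends only on the first feature.
def toy_model(row):
    return int(row[0] >= 0.5)


rows = [(0.0, 1.0), (1.0, 0.0), (0.0, 0.0), (1.0, 1.0)] * 5
labels = [toy_model(r) for r in rows]
print(permutation_importance(toy_model, rows, labels))
```

A feature the model ignores receives an importance of exactly zero, which is consistent with the zero scores reported for several features in the text.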
Feature importance for the k-NN model (Figure 9), also derived using the permutation method, reveals NumericImpactResponses as the most significant contributor, with an importance score of 0.2433. This far exceeds the contribution of NumericConflictResponses (0.008) and other features, which have zero measurable importance.
The dominance of NumericImpactResponses indicates that it plays a critical role in determining proximity-based classifications within the k-NN framework. Unlike tree-based models, k-NN relies on feature similarity, and the high importance score of NumericImpactResponses suggests that it provides the most discriminative power in defining neighborhood relationships.
The high importance of NumericImpactResponses across all models highlights its pivotal role in determining outcomes. This feature, which quantifies participants’ perceptions of smoking’s impact on relationships, likely reflects their personal and familial experiences. Its dominance in feature importance suggests that individuals are acutely aware of how smoking behaviors affect interpersonal dynamics, making this variable a strong predictor of relational patterns.
The secondary role of features such as NumericConflictResponses and NumericAcceptanceResponses across models suggests that while these factors contribute to relational dynamics, they are not as consistently or strongly predictive. Their lower importance indicates that conflict or acceptance dynamics may only manifest in specific familial contexts, making them less universally applicable predictors.
Among the models tested (Table 4), the ensemble model achieved the highest cross-validation accuracy of 93.33% and a minimal loss of 1.43%, effectively classifying complex relationships within the dataset. The decision tree classifier demonstrated moderate performance with an accuracy of 83.33% and a loss of 18.57%, identifying key decision points but struggling with intricate patterns. Similarly, the k-nearest neighbor (k-NN) model achieved an accuracy of 90% and a loss of 14.29%, demonstrating high sensitivity to localized data structures. The support vector machine (SVM) model underperformed, with a cross-validation accuracy of 80% and a loss of 44.29%, reflecting challenges in handling nonlinear and high-dimensional data.
To improve comparability between models, feature importance results were summarized across all four approaches—decision tree, SVM, ensemble, and k-NN—using both built-in importance metrics and permutation-based methods. As shown in Figure 6, Figure 7, Figure 8 and Figure 9, one feature—NumericImpactResponses—consistently dominated all models in predictive strength, followed by NumericConflictResponses and, in some models, FamilySmoking and Age. To synthesize these findings, Table 5 presents a side-by-side ranking of the top features by normalized importance score across models. Although the absolute values differ slightly depending on the method, the relative influence of the variables remains largely consistent.
The comparative feature importance table highlights both convergence and variability across models. All four algorithms consistently identified NumericImpactResponses as the most predictive feature, capturing participants’ perceived impact of smoking on family relationships. Additional variables such as NumericConflictResponses and FamilySmoking showed moderate influence in specific models (notably SVM and k-NN), while demographic factors such as age and gender had minimal impact. These results suggest that psychosocial and relational variables play a more central role in model predictions than basic demographic characteristics, reinforcing the importance of family dynamics in smoking behavior analysis.
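To make the cross-model comparison concrete, the sketch below rescales the raw importance scores quoted in the text so that the top feature of each model equals 1, and then ranks the features within each model. Dividing by the per-model maximum is an assumption: the text does not state how the scores in Table 5 were normalized, so the resulting values need not match that table.

```python
# Raw importance scores as quoted in the text for each model.
RAW_IMPORTANCE = {
    "Decision tree": {
        "NumericImpactResponses": 0.1875,
        "NumericConflictResponses": 0.03807,
    },
    "Ensemble": {
        "NumericImpactResponses": 0.1459,
        "Gender": 0.0085,
        "When did you start smoking": 0.0031,
        "CleanedSmokingFrequency": 0.0023,
    },
    "SVM": {
        "NumericImpactResponses": 0.1867,
        "Age": 0.0167,
        "NumericFamilySmoking": 0.0133,
        "When did you start smoking": 0.0067,
    },
    "k-NN": {
        "NumericImpactResponses": 0.2433,
        "NumericConflictResponses": 0.008,
    },
}


def normalize(scores):
    """Scale scores so the largest equals 1 (assumed normalization)."""
    top = max(scores.values())
    return {name: round(value / top, 3) for name, value in scores.items()}


def rank(scores):
    """Feature names ordered from most to least important."""
    return sorted(scores, key=scores.get, reverse=True)


for model, scores in RAW_IMPORTANCE.items():
    print(model, rank(scores), normalize(scores))
```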
The features that were most influential in supervised models also contributed significantly to separation in the PCA projection. NumericImpactResponses and ConflictResponses, which ranked highest across multiple algorithms, were also key drivers of visible clustering in the reduced two-dimensional feature space. This consistency between supervised and unsupervised analyses supports the robustness and interpretability of the selected predictors.
Confusion matrices (Figure 10) highlighted the strengths and weaknesses of these models. The ensemble model consistently minimized misclassification across all response categories, while the decision tree and k-NN models exhibited moderate misclassification rates, capturing general trends but struggling with ambiguous cases. The SVM model demonstrated significant classification errors, underscoring its limitations in capturing nuanced distinctions among complex response variables.
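The quantities that a confusion matrix summarizes can be derived with a few lines of Python. In the sketch below, rows are true classes and columns are predicted classes; the 4 × 4 matrix is invented for illustration and does not reproduce the cell counts of Figure 10. Its size of 30 cases is likewise an assumption: 28 correct out of 30 would correspond to 93.33%, but the text does not state the size of the evaluation set.

```python
def accuracy_from_confusion(matrix):
    """Overall accuracy: diagonal (correct) counts divided by all counts."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total


def per_class_recall(matrix):
    """Recall per true class; None when a class has no cases."""
    recalls = []
    for i, row in enumerate(matrix):
        n = sum(row)
        recalls.append(row[i] / n if n else None)
    return recalls


# Invented matrix for four response categories (rows = true, columns = predicted).
example = [
    [5, 0, 0, 0],
    [0, 10, 1, 0],
    [0, 1, 8, 0],
    [0, 0, 0, 5],
]
print(accuracy_from_confusion(example))
print(per_class_recall(example))
```

Per-class recall shows which response categories a model confuses, which overall accuracy alone cannot reveal.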

4. Discussion

Smoking significantly increases the risk of developing respiratory diseases, cardiovascular problems, and various forms of cancer, severely impacting overall health and longevity [32].
The findings of this study provide valuable insights into the socio-demographic, familial, and behavioral factors influencing smoking behaviors and cessation attempts. The lack of significant association between strong urges to smoke and cessation attempts (χ² = 5.1532, p = 0.27194) challenges prior studies, such as Fidler et al. (2011) [33], which emphasized craving intensity as a predictor of cessation efforts. This suggests that a more comprehensive understanding of cessation predictors is necessary.
Nocturnal smoking behavior was significantly correlated with higher cigarette consumption (χ² = 14.2456, p = 0.014122), reinforcing findings by Foulds et al. (2015) that such behavior reflects deeper addiction patterns [26]. The strong correlation between e-cigarette use and traditional smoking (ρ = 0.67921, p = 8.0027 × 10⁻¹⁵) highlights the behavioral overlap and concurrent use of these products, as supported by Glantz and Bareham (2024) [32]. Age also played a critical role, with older participants initiating smoking at later ages (ρ = 0.22546, p = 0.0241), underscoring age-related differences in smoking initiation and patterns. Gender differences in smoking type (χ² = 10.63, p = 0.031) further emphasize the need to consider gender-specific interventions.
The negative correlation between the time to the first cigarette after waking and willingness to quit (ρ = −0.30509, p = 0.002) suggests that earlier smoking upon waking may signify more entrenched addiction and lower cessation intent. Additionally, family dynamics significantly influenced smoking behaviors, with familial tension positively correlated with cigarette consumption (ρ = 0.22 to 0.34). This finding aligns with Sharma et al. (2020), who highlighted the impact of family conflict on smoking [34].
Machine learning models provided additional insights. The ensemble model’s superior performance, with a cross-validation accuracy of 93.33%, demonstrates its robustness in identifying complex relationships within the dataset. This aligns with Dietterich (2000), who emphasized the effectiveness of ensemble methods in handling intricate data structures [35]. The decision tree classifier demonstrated moderate predictive power, with a cross-validation accuracy of 83.33% and a loss of 18.57%, reflecting its ability to identify key decision nodes but with limitations in handling intricate patterns. The k-nearest neighbor (k-NN) model achieved a cross-validation accuracy of 90.00% and a loss of 14.29%, suggesting high sensitivity to localized data structures. Similar studies in behavioral prediction have also noted such trade-offs (Chang, 2024) [36]. The SVM model’s lower performance (80.00% accuracy) reflects its challenges with mixed-variable datasets, consistent with findings by Guido et al. (2024) [37].
The feature importance analysis across all four models (ensemble, SVM, decision tree, and k-NN) revealed a consistent pattern, highlighting the dominant role of the variable NumericImpactResponses. This feature was the most influential in predicting perceived impact regardless of the algorithm used, suggesting a strong and stable relationship between participants’ responses regarding personal impact and the classification outcomes. In contrast, demographic variables such as gender, age, and smoking status contributed minimally across models, indicating that these factors were not key discriminators in this context. The ensemble and k-NN models concentrated importance almost exclusively on the NumericImpactResponses, while the SVM and decision tree models distributed minor importance to features such as NumericFamilySmoking and NumericConflictResponses, pointing to potential secondary influences. These findings emphasize the value of subjective self-assessment measures in predictive modeling over more traditional demographic or behavioral variables and support the robustness of NumericImpactResponses as a primary indicator across various algorithmic approaches.
Machine learning (ML) models have demonstrated substantial utility in predictive analysis based on smoking behavioral patterns, notably in the area of smoking cessation. Integrating ML models into public health strategies provides a sophisticated approach to understanding smoking behaviors and intervening in them effectively.
Recent studies have successfully used ML models to predict smoking cessation outcomes and to identify important factors that influence quitting. For example, a study based on data from the Population Assessment of Tobacco and Health (PATH) applied a variety of binary classifiers to predict smoking cessation among US citizens. That study demonstrated the ability of ML models to detect intricate associations within longitudinal data by identifying both known and new factors affecting the shift from current to former cigarette consumption [38].
Furthermore, prediction models for smoking cessation have been built into applications that doctors and patients can use directly. These models provide tailored predictions that respond dynamically to new data, updating the expected success rate of quitting as input values change [39]. This flexibility is essential in therapeutic settings because each patient’s individual characteristics can strongly affect how well they respond to treatment.
In this area, machine learning is also being used to address problems such as class imbalance in datasets, where the proportion of people who quit smoking is much lower than that of those who do not. To improve the predictive accuracy and reliability of ML models, strategies such as ensemble methods and random oversampling and undersampling have been used [38]. By balancing the dataset, these techniques help ensure that the predictive models do not favor the majority class.
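Random oversampling, one of the balancing strategies mentioned above, can be sketched in a few lines of Python: cases from each smaller class are duplicated at random until every class is as large as the largest one. The function name, the fixed seed, and the toy labels are hypothetical, and the sketch is not the procedure of the cited studies. In practice, oversampling is usually applied to the training portion only so that duplicated cases do not leak into the evaluation data.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes are equal in size."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data (invented): 2 participants who quit versus 8 who did not.
rows = [(i,) for i in range(10)]
labels = ["quit"] * 2 + ["smoke"] * 8
balanced_rows, balanced_labels = random_oversample(rows, labels)
print(Counter(balanced_labels))
```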
Additionally, researchers can identify the elements that are most predictive of quitting smoking thanks to the variable importance and selection process in machine learning models. In order to guarantee that the models are trained on relevant, high-quality data, this procedure entails a great deal of data preparation and cleaning. The development of focused smoking cessation programs requires the identification of important predictors, such as nicotine dependency and the use of additional tobacco products [38,39]. In our work, we propose predictive models that can be relatively simply implemented into social smoking cessation programs based on patient interviews, which take into account factors studied by questionnaire, as in our case.
The findings of Hummel et al. (2017) [40] demonstrated that subjective measures of intention can be robust predictors of cessation behavior. Gallus et al. (2023) [41] studied the role of self-efficacy, a subjective self-assessment measure, in predicting smoking cessation among smokers attending a cessation program. The results confirmed that higher levels of self-efficacy were significant predictors of successful cessation, highlighting the importance of subjective confidence in one’s ability to quit over other demographic factors.
The performed PCA shows that self-reported perceptions can be meaningfully distinguished in reduced-dimensional space. This supports the interpretive value of subjective assessments in modeling family-related outcomes, even when working with a relatively small sample. The satisfactory result of this reduction further validates the relevance of the selected features and justifies their use in subsequent classification analyses. These findings align with previous research emphasizing the significance of self-reported perceptions in understanding smoking behaviors and their impacts on family dynamics. Previous studies have demonstrated that adolescents’ perceptions of smoking-related risks and benefits are closely associated with their smoking behaviors, which underscores the importance of subjective assessments in predicting smoking initiation and cessation [42]. Additionally, research has demonstrated that self-reported measures of nicotine dependence are effective predictors of smoking cessation outcomes, underscoring the value of subjective assessments in modeling smoking behaviors [43].
Overall, the results emphasize the need for targeted interventions addressing specific behavioral patterns, such as nocturnal smoking and early-morning cigarette use, as well as familial dynamics. The study’s results provide actionable insights that can guide social interventions. For instance, the positive correlation between family tension and cigarette consumption highlights the need to integrate family counseling or stress-reduction strategies into smoking cessation programs. Social campaigns should not only target individual behavior but also promote healthy communication and shared activities within households, especially those with dual users. Additionally, machine learning models developed in this study could be further adapted to identify individuals at high risk of dependence and tailor prevention strategies accordingly. The strong predictive performance of the ensemble model suggests its potential application in future behavioral studies. Further research is needed to refine machine learning approaches for health behavior prediction and to explore the nuanced roles of socio-demographic factors in smoking cessation.
ML models are a potent tool for studying the behavioral patterns linked to smoking and forecasting the results of quitting. These models can reveal complex patterns and correlations that conventional analytical techniques would overlook by using huge, varied datasets and advanced algorithms. The advancement of public health initiatives targeted at lowering smoking rates and enhancing the results for those trying to stop smoking depends on the ongoing development and use of machine learning techniques in this field [39].
This paper is not free of limitations. Although the findings are informative, the relatively small sample size of 100 participants presents a notable limitation. Small sample studies can yield useful insights, especially in exploratory research, but they limit the statistical power and generalizability of the results. In particular, subgroup-specific analyses—such as those based on gender or living environment—should be interpreted with caution. Future research should aim to recruit a larger and more diverse cohort to ensure greater external validity and robustness of the observed patterns.
This relatively small sample size, though sufficient for preliminary insights, limits the generalizability of the findings. A small sample reduces the statistical power of the analyses and increases the likelihood of sampling bias, where the observed results may not accurately represent the broader population. This constraint is particularly relevant for machine learning models, which typically need larger datasets to capture complex patterns and interactions effectively. For instance, the ensemble learning model, which achieved high accuracy in this study, may prove less robust when applied to more diverse or significantly larger populations. The sample size also affects the use of Principal Component Analysis (PCA). While PCA is widely used for exploratory data analysis and dimensionality reduction, its effectiveness can be limited in smaller datasets, where the extracted components may be less stable and less generalizable. In addition, PCA assumes linear relationships among variables and may not fully capture complex or nonlinear interactions relevant to the constructs studied. To address these limitations, future studies could consider nonlinear dimensionality reduction techniques such as t-distributed Stochastic Neighbor Embedding (t-SNE) or Uniform Manifold Approximation and Projection (UMAP), which are often better suited to revealing subtle patterns and groupings in high-dimensional data, particularly in small to medium-sized samples.
Another limitation of this study is the use of Google Forms as the data collection tool, which, while accessible and efficient, may be subject to self-selection bias and limited control over respondent authenticity, potentially affecting the generalizability of the findings. Future research could strengthen reliability by incorporating follow-up interviews or triangulating responses with other data sources.

5. Conclusions

This study offers insights into the interplay between smoking behaviors, familial dynamics, and addiction patterns, and points to factors that public health interventions should address. Gender, family tension, and the timing of the first cigarette after waking emerged as the factors most clearly associated with smoking habits and cessation efforts. Smoking product choice differed by gender (χ² = 10.63, p = 0.031), and higher family tension was associated with more cigarettes smoked per day (Spearman ρ = 0.22–0.34), which highlights the need for personalized strategies to support individuals in high-stress familial environments. The association between smoking soon after waking and reduced willingness to quit (ρ = −0.31, p = 0.002) underscores the importance of early behavioral patterns in shaping cessation outcomes.
The analysis of familial dynamics revealed distinct differences among smoking groups. E-cigarette users reported stronger family support, higher levels of shared family time, and fewer conflicts compared to traditional cigarette smokers and dual users. However, they exhibited high nicotine dependence, with no participants in the e-cigarette or dual-use groups reporting an absence of dependence. Traditional cigarette smokers faced weaker familial bonds, marked by higher conflict levels and lower support. Dual users experienced the most challenging family dynamics, including the least shared time with family, coupled with the highest levels of nicotine dependence. These findings highlight the need for tailored interventions that simultaneously address familial challenges and addiction levels, particularly for dual users, who face compounded behavioral and relational difficulties.
Machine learning techniques, particularly ensemble models, showed clear potential for analyzing the relationships between smoking behaviors and associated factors. The ensemble model reached 93.33% accuracy, ahead of k-nearest neighbors (90.00%), decision trees (83.33%), and support vector machines (SVMs; 80.00%), and identified complex patterns within the dataset more effectively [37]. However, the study's relatively small sample size of 100 participants limits the generalizability of these findings. Small datasets reduce statistical power, raise the risk of sampling bias, and constrain the ability to identify subgroup-specific trends, such as those based on age, gender, or familial characteristics.
Future research should address these limitations by increasing the sample size to enhance statistical robustness and generalizability. Expanding the diversity of the sample population and incorporating additional variables, such as socioeconomic status, stress, coping mechanisms, and mental health factors, would provide a more comprehensive understanding of smoking behaviors. Furthermore, testing machine learning models on external datasets from diverse populations will validate their reliability in broader contexts. Advanced feature engineering and improvements in ensemble techniques could further refine model performance and uncover additional nuanced relationships.
In conclusion, this study highlights the profound influence of familial dynamics, addiction patterns, and demographic factors on smoking behaviors. It underscores the value of machine learning as a tool for behavioral research, offering a powerful means of analyzing complex datasets and informing targeted interventions. Public health strategies should consider the unique needs of different smoking groups. For example, interventions aimed at reducing family conflicts and promoting shared activities may benefit dual users, while cessation programs targeting e-cigarette and dual users should focus on mitigating high nicotine dependence. By addressing these multifaceted challenges, future efforts can more effectively reduce smoking prevalence and improve outcomes for individuals and their families.

Author Contributions

J.C.: Writing—original draft, Writing—review and editing, Visualization, Validation, Software, Methodology, Investigation, Formal analysis, Conceptualization. M.K.: Writing—review and editing, Writing—original draft, Methodology, Investigation, Formal analysis, Conceptualization. P.S.K.: Writing—original draft, Writing—review and editing, Investigation, Conceptualization, Supervision. R.D.: Writing—review and editing, Supervision, Conceptualization. A.F.: Writing—review and editing, Supervision, Conceptualization. R.J.D.: Writing—review and editing, Validation, Supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding. The Article Processing Charge was financed under the European Funds for Silesia 2021–2027 Program co-financed by the Just Transition Fund—project entitled “Development of the Silesian biomedical engineering potential in the face of the challenges of the digital and green economy (BioMeDiG)”. Project number: FESL.10.25-IZ.01-07G5/23.

Institutional Review Board Statement

The study was conducted in accordance with the Declaration of Helsinki and approved by the Ethics Committee of the Medical University of Silesia (approval code: KNW/0022/KB1/79/I8).

Informed Consent Statement

Informed consent was obtained from all subjects involved in the study.

Data Availability Statement

The authors confirm that the data supporting the findings of this study are available upon request. The dataset is not publicly accessible, but it can be shared with interested researchers upon reasonable request.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Thomas, P.A.; Liu, H.; Umberson, D. Family Relationships and Well-Being. Innov. Aging 2017, 1, igx025.
  2. Ross, C.E.; Mirowsky, J. Family Relationships, Social Support and Subjective Life Expectancy. J. Health Soc. Behav. 2002, 43, 469.
  3. McMahon, E.L.; Wallace, S.; Samuels, L.R.; Heerman, W.J. The relationships between resilience and child health behaviors in a national dataset. Pediatr. Res. 2024.
  4. Kganyago Mphaphuli, L. The Impact of Dysfunctional Families on the Mental Health of Children. In Parenting in Modern Societies; IntechOpen: London, UK, 2023.
  5. Steeger, C.M.; Bailey, J.A.; Epstein, M.; Hill, K.G. The link between parental smoking and youth externalizing behaviors: Effects of smoking, psychosocial factors, and family characteristics. Psychol. Addict. Behav. 2019, 33, 243–253.
  6. Omasu, F.; Uemura, S.; Yukizane, S. The Impact of Family Relationships on the Smoking Habits of University Students. Open J. Prev. Med. 2015, 5, 14–22.
  7. Wang, G.; Wu, L. Healthy People 2020: Social Determinants of Cigarette Smoking and Electronic Cigarette Smoking among Youth in the United States 2010–2018. Int. J. Environ. Res. Public Health 2020, 17, 7503.
  8. Dietz, N.A.; Arheart, K.L.; Sly, D.F.; Lee, D.J.; McClure, L.A. Correlates of smoking among youth: The role of parents, friends, attitudes/beliefs, and demographics. Tob. Induc. Dis. 2016, 14, 9.
  9. Mahabee-Gittens, E.M.; Harun, N.; Glover, M.; Folger, A.T.; Parikh, N.A.; Altaye, M.; Arnsperger, A.; Beiersdorfer, T.; Bridgewater, K.; Cahill, T.; et al. Prenatal tobacco smoke exposure and risk for cognitive delays in infants born very premature. Sci. Rep. 2024, 14, 1397.
  10. McEvoy, C.T.; Spindel, E.R. Pulmonary Effects of Maternal Smoking on the Fetus and Child: Effects on Lung Development, Respiratory Morbidities, and Life Long Lung Health. Paediatr. Respir. Rev. 2017, 21, 27–33.
  11. Barrington-Trimis, J.L.; Berhane, K.; Unger, J.B.; Cruz, T.B.; Urman, R.; Chou, C.P.; Howland, S.; Wang, K.; Pentz, M.A.; Gilreath, T.D.; et al. The E-cigarette Social Environment, E-cigarette Use, and Susceptibility to Cigarette Smoking. J. Adolesc. Health 2016, 59, 75–80.
  12. Hubbard, G.; Gorely, T.; Ozakinci, G.; Polson, R.; Forbat, L. A systematic review and narrative summary of family-based smoking cessation interventions to help adults quit smoking. BMC Fam. Pract. 2016, 17, 73.
  13. Srivastava, P.; Trinh, T.A. The effect of parental smoking on children’s cognitive and non-cognitive skills. Econ. Hum. Biol. 2021, 41, 100978.
  14. Guo, Y.; Liu, D.y.; Wang, Y.j.; Huang, M.j.; Jiang, N.; Hou, Q.; Feng, B.; Wu, W.y.; Wu, Y.b.; Qi, F.; et al. Family functioning and nicotine dependence among smoking fathers: A cross-sectional study. BMC Public Health 2023, 23, 658.
  15. Mohammadnezhad, M.; Tsourtos, G.; Wilson, C.; Ratcliffe, J.; Ward, P. Understanding Socio-cultural Influences on Smoking among Older Greek-Australian Smokers Aged 50 and over: Facilitators or Barriers? A Qualitative Study. Int. J. Environ. Res. Public Health 2015, 12, 2718–2734.
  16. Egbe, C.O.; Petersen, I.; Meyer-Weitz, A.; Oppong Asante, K. An exploratory study of the socio-cultural risk influences for cigarette smoking among Southern Nigerian youth. BMC Public Health 2014, 14, 1204.
  17. Wilkinson, A.V.; Shete, S.; Prokhorov, A.V. The moderating role of parental smoking on their children’s attitudes toward smoking among a predominantly minority sample: A cross-sectional analysis. Subst. Abus. Treat. Prev. Policy 2008, 3, 18.
  18. Hiscock, R.; Dobbie, F.; Bauld, L. Smoking Cessation and Socioeconomic Status: An Update of Existing Evidence from a National Evaluation of English Stop Smoking Services. BioMed Res. Int. 2015, 2015, 274056.
  19. Waters, A.; Kendzor, D.; Roys, M.; Stewart, S.; Copeland, A. Financial strain mediates the relationship between socioeconomic status and smoking. Tob. Prev. Cessat. 2019, 5, 3.
  20. Pokhrel, P.; Herzog, T.A.; Muranaka, N.; Regmi, S.; Fagan, P. Contexts of cigarette and e-cigarette use among dual users: A qualitative study. BMC Public Health 2015, 15, 859.
  21. Martinelli, T.; Candel, M.J.J.M.; de Vries, H.; Talhout, R.; Knapen, V.; van Schayck, C.P.; Nagelhout, G.E. Exploring the gateway hypothesis of e-cigarettes and tobacco: A prospective replication study among adolescents in the Netherlands and Flanders. Tob. Control 2021, 32, 170–178.
  22. Wang, J.W.; Cao, S.S.; Hu, R.Y. Smoking by family members and friends and electronic-cigarette use in adolescence: A systematic review and meta-analysis. Tob. Induc. Dis. 2018, 16, 5.
  23. Du, Y.; Palmer, P.H.; Sakuma, K.L.; Blake, J.; Johnson, C.A. The association between family structure and adolescent smoking among multicultural students in Hawaii. Prev. Med. Rep. 2015, 2, 206–212.
  24. Eslava, D.; Martínez-Vispo, C.; Villanueva-Blasco, V.J.; Errasti-Pérez, J.M.; Al-Halabí, S. Family Conflict and the Use of Conventional and Electronic Cigarettes in Adolescence: The Role of Impulsivity Traits. Int. J. Ment. Health Addict. 2022, 21, 3885–3896.
  25. Hill, K.G.; Hawkins, J.D.; Catalano, R.F.; Abbott, R.D.; Guo, J. Family influences on the risk of daily smoking initiation. J. Adolesc. Health 2005, 37, 202–210.
  26. Foulds, J.; Veldheer, S.; Yingst, J.; Hrabovsky, S.; Wilson, S.J.; Nichols, T.T.; Eissenberg, T. Development of a Questionnaire for Assessing Dependence on Electronic Cigarettes Among a Large Sample of Ex-Smoking E-cigarette Users. Nicotine Tob. Res. 2014, 17, 186–192.
  27. Children Whose Parents Smoke Are 4 Times as Likely to Take Up Smoking Themselves—gov.uk. Available online: https://www.gov.uk/government/news/children-whose-parents-smoke-are-four-times-as-likely-to-take-up-smoking-themselves (accessed on 2 April 2025).
  28. Alves, J.; Perelman, J.; Soto-Rojas, V.; Richter, M.; Rimpelä, A.; Loureiro, I.; Federico, B.; Kuipers, M.A.; Kunst, A.E.; Lorant, V. The role of parental smoking on adolescent smoking and its social patterning: A cross-sectional survey in six European cities. J. Public Health 2016, 39, 339–346.
  29. Sanchez, S.; Kaufman, P.; Pelletier, H.; Baskerville, B.; Feng, P.; O’Connor, S.; Schwartz, R.; Chaiton, M. Is vaping cessation like smoking cessation? A qualitative study exploring the responses of youth and young adults who vape e-cigarettes. Addict. Behav. 2021, 113, 106687.
  30. Kim, S.; Gil, M.; Kim-Godwin, Y. Development and Validation of the Family Relationship Assessment Scale in Korean College Students’ Families. Fam. Process 2020, 60, 586–601.
  31. Yingst, J.; Foulds, J.; Hobkirk, A.L. Dependence and Use Characteristics of Adult JUUL Electronic Cigarette Users. Subst. Use Misuse 2020, 56, 61–66.
  32. Glantz, S.A.; Bareham, D.W. E-Cigarettes: Use, Effects on Smoking, Risks, and Policy Implications. Annu. Rev. Public Health 2024, 39, 215–235.
  33. Fidler, J.A.; West, R. Enjoyment of smoking and urges to smoke as predictors of attempts and success of attempts to stop smoking: A longitudinal study. Drug Alcohol Depend. 2011, 115, 30–34.
  34. Sharma, R.; Martins, N.; Tripathi, A.; Caponnetto, P.; Garg, N.; Nepovimova, E.; Kuča, K.; Prajapati, P.K. Influence of Family Environment and Tobacco Addiction: A Short Report from a Post-Graduate Teaching Hospital, India. Int. J. Environ. Res. Public Health 2020, 17, 2868.
  35. Dietterich, T.G. Ensemble Methods in Machine Learning. In Multiple Classifier Systems; Springer: Berlin/Heidelberg, Germany, 2000; pp. 1–15.
  36. Chang, X. Comparative Analysis of Machine Learning, Decision Trees, and K-Nearest Neighbors for Heart Disease Prediction. Appl. Comput. Eng. 2024, 82, 188–192.
  37. Guido, R.; Ferrisi, S.; Lofaro, D.; Conforti, D. An Overview on the Advancements of Support Vector Machine Models in Healthcare Applications: A Review. Information 2024, 15, 235.
  38. Issabakhsh, M.; Sánchez-Romero, L.M.; Le, T.T.T.; Liber, A.C.; Tan, J.; Li, Y.; Meza, R.; Mendez, D.; Levy, D.T. Machine learning application for predicting smoking cessation among US adults: An analysis of waves 1–3 of the PATH study. PLoS ONE 2023, 18, e0286883.
  39. Lai, C.C.; Huang, W.H.; Chang, B.C.C.; Hwang, L.C. Development of Machine Learning Models for Prediction of Smoking Cessation Outcome. Int. J. Environ. Res. Public Health 2021, 18, 2584.
  40. Hummel, K.; Candel, M.J.J.M.; Nagelhout, G.E.; Brown, J.; van den Putte, B.; Kotz, D.; Willemsen, M.C.; Fong, G.T.; West, R.; de Vries, H. Construct and Predictive Validity of Three Measures of Intention to Quit Smoking: Findings from the International Tobacco Control (ITC) Netherlands Survey. Nicotine Tob. Res. 2017, 20, 1101–1108.
  41. Gallus, S.; Cresci, C.; Rigamonti, V.; Lugo, A.; Bagnardi, V.; Fanucchi, T.; Cirone, D.; Ciaccheri, A.; Cardellicchio, S. Self-efficacy in predicting smoking cessation: A prospective study from Italy. Tob. Prev. Cessat. 2023, 9, 15.
  42. Morrell, H.E.R.; Song, A.V.; Halpern-Felsher, B.L. Predicting adolescent perceptions of the risks and benefits of cigarette smoking: A longitudinal investigation. Health Psychol. 2010, 29, 610–617.
  43. Kozlowski, L.T.; Porter, C.Q.; Orleans, C.; Pope, M.A.; Heatherton, T. Predicting smoking cessation with self-reported measures of nicotine dependence: FTQ, FTND, and HSI. Drug Alcohol Depend. 1994, 34, 211–216.
Figure 1. The stages of the research.
Figure 2. PCA projection using the top six most discriminative features. The plot shows class clustering based on participants’ responses regarding the perceived impact of smoking on family relationships. Although class separation is not complete, the projection highlights informative structure in the data.
Figure 3. Self-reported effects of smoking by age at smoking onset.
Figure 4. Self-reported smoking frequency among participants.
Figure 5. Decision tree for predicting outcomes based on NumericImpactResponses and NumericConflictResponses.
Figure 6. The feature importance analysis by decision tree.
Figure 7. The feature importance analysis by ensemble method.
Figure 8. Feature importance analysis by SVM.
Figure 9. The feature importance analysis by k-NN.
Figure 10. Confusion matrices of ML models in sequence: decision tree confusion matrix, SVM confusion matrix, ensemble model confusion matrix, and k-NN confusion matrix. Color intensity of each cell represents the relative frequency of predictions. Darker shades of blue indicate a higher number of correctly classified instances, typically seen along the diagonal. Conversely, light blue to white indicates fewer correct predictions, while light red hues highlight misclassifications, positioned off the diagonal.
Table 1. Participant demographics and smoking preferences (N = 100).

| Variable | Categories | Frequency (%) |
|---|---|---|
| Gender | Male/Female/Other | 44/54/2 |
| Age | Mean (SD): 23.4 (4.6); Range: 18–39 | |
| Living Environment | Rural/<50 k/50–150 k/150–500 k/>500 k | 12/9/15/30/34 |
| Educational Attainment | Primary/Lower Secondary/Vocational/Secondary/University | 2/6/12/42/38 |
| Smoking Behavior | Electronic/Traditional/Both | 43/21/36 |
Table 2. Comparison of family relationship indicators (FRAC) and nicotine dependence (PSECDI) across smoking groups.

| Category | Indicator | E-Cigarette Smokers | Traditional Smokers | Dual Users |
|---|---|---|---|---|
| FRAC | Family Support (mean) | 4.22 | 3.50 | 3.67 |
| FRAC | Family Conflict (mean) | 1.81 | 2.00 | 1.89 |
| FRAC | Family Togetherness (mean) | 3.26 | 3.12 | 2.83 |
| PSECDI | No Dependence (%) | 0.00 | 4.76 | 0.00 |
| PSECDI | Low Dependence (%) | 21.05 | 33.33 | 16.67 |
| PSECDI | Moderate Dependence (%) | 31.58 | 19.05 | 33.33 |
| PSECDI | High Dependence (%) | 47.37 | 42.86 | 50.00 |
| PSECDI | PSECDI Score (mean) | 13.11 | 12.05 | 13.50 |
Table 3. Analysis of factors influencing smoking behaviors and family dynamics.

| Variables | Test | Results | p-Value | Interpretation |
|---|---|---|---|---|
| Participant age and type of smoking product | ANOVA | F = 8.79 | p < 0.001 | Significant differences in smoking products among age groups |
| Education level and type of smoking product | Chi-Square | χ² = 6.41 | 0.601 | No significant association |
| Gender and type of smoking product | Chi-Square | χ² = 10.63 | 0.031 | Significant differences between genders in smoking products used |
| Participant age and age of smoking initiation | Spearman Correlation | ρ = 0.23 | 0.024 | Older participants started smoking at later ages |
| Place of residence and type of smoking product | Chi-Square | χ² = 7.02 | 0.534 | No significant association |
| Family history of smoking and type of smoking product | Chi-Square | χ² = 2.00 | 0.368 | No significant association |
| Perceived impact of smoking on relationships and number of cigarettes smoked | Spearman Correlation | ρ = −0.15 | 0.135 | Weak negative correlation, not statistically significant |
| Smoking frequency and willingness to quit | Spearman Correlation | ρ = −0.07 | 0.485 | Very weak negative correlation, not significant |
| Time to first cigarette after waking and willingness to quit | Spearman Correlation | ρ = −0.31 | 0.002 | Moderate negative correlation, statistically significant |
| Family conflict score and willingness to quit | Chi-Square | χ² = 5.15 | 0.272 | No significant association |
| Family support score and willingness to quit | Logistic Regression | Coefficients: 1.37, −0.18 | p = 0.058, p = 0.300 | Trend toward significance, but not significant |
| Family tension score and number of cigarettes smoked per day | Spearman Correlation | ρ = 0.22–0.34 | 0.029–0.001 | Significant positive correlation |
| Family acceptance score and smoking frequency | Spearman Correlation | ρ = −0.09 to −0.08 | 0.334–0.467 | No significant correlation |
| Quit attempts and urge to smoke | Chi-Square | χ² = 5.15 | 0.272 | No significant correlation |
| Nocturnal smoking behavior and smoking frequency | Chi-Square | χ² = 14.25 | 0.014 | Significant association |
| Current e-cigarette use and prior traditional cigarette use | Chi-Square | χ² = 8.30 | p = 0.00396 | Statistically significant association between variables |
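Table 3 reports chi-square statistics and p-values but not the degrees of freedom. The sketch below shows how a p-value follows from a chi-square statistic, using closed forms that hold for one degree of freedom and for even degrees of freedom. The degrees of freedom passed in are assumptions: they are inferred as (rows − 1) × (columns − 1) from the category counts in Table 1 (for example, three gender categories by three product types gives 4) and are not stated in the paper. The helper name `chi2_sf` is illustrative.

```python
import math


def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability P(X >= x) for a chi-square variable.

    Only df == 1 and even df are handled, because both have simple
    closed forms; odd df > 1 is outside the scope of this sketch.
    """
    if x < 0 or df < 1:
        raise ValueError("need x >= 0 and df >= 1")
    if df == 1:
        # A chi-square(1) variable is the square of a standard normal.
        return math.erfc(math.sqrt(x / 2.0))
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over k < df/2 of (x/2)^k / k!
        half = x / 2.0
        term, total = 1.0, 1.0
        for k in range(1, df // 2):
            term *= half / k
            total += term
        return math.exp(-half) * total
    raise NotImplementedError("odd df > 1 is not covered by this sketch")


# (row label, statistic from Table 3, ASSUMED df, p-value reported in Table 3)
rows = [
    ("Education level vs. product", 6.41, 8, 0.601),
    ("Gender vs. product", 10.63, 4, 0.031),
    ("Place of residence vs. product", 7.02, 8, 0.534),
    ("Family history vs. product", 2.00, 2, 0.368),
    ("E-cigarette use vs. prior traditional use", 8.30, 1, 0.00396),
]

for label, stat, df, reported in rows:
    print(f"{label}: chi2 = {stat}, assumed df = {df}, "
          f"computed p = {chi2_sf(stat, df):.5f}, reported p = {reported}")
```

With these assumed degrees of freedom, the computed values agree with the reported p-values to the precision shown in the table, which supports that reading of the table but does not confirm it.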
Table 4. Performance evaluation of the models.

| Model | Accuracy (%) | Precision | Recall | F1 Score |
|---|---|---|---|---|
| Decision Tree * | 83.33 | 0.79 | 0.70 | 0.74 |
| Ensemble Method * | 93.33 | 0.91 | 0.91 | 0.91 |
| SVM * | 80.00 | 0.60 | 0.75 | 0.67 |
| k-NN * | 90.00 | 0.90 | 0.82 | 0.86 |

* Top feature: NumericImpactResponses for all models.
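The F1 scores in Table 4 can be reproduced from the precision and recall columns, since F1 is their harmonic mean. The short sketch below recomputes them from the rounded table values; the function name `f1_score` is illustrative, and how the authors averaged precision and recall across classes is not stated here.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, F1 as reported in Table 4)
table4 = {
    "Decision Tree": (0.79, 0.70, 0.74),
    "Ensemble Method": (0.91, 0.91, 0.91),
    "SVM": (0.60, 0.75, 0.67),
    "k-NN": (0.90, 0.82, 0.86),
}

for model, (precision, recall, reported) in table4.items():
    computed = f1_score(precision, recall)
    print(f"{model}: computed F1 = {computed:.2f}, reported F1 = {reported:.2f}")
```

All four recomputed values round to the reported F1 scores, so the three metric columns are internally consistent.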
Table 5. Comparative feature importance across machine learning models (normalized scores).

| Feature | Decision Tree | Ensemble | SVM | k-NN |
|---|---|---|---|---|
| Gender | 0.0000 | 0.0085 | 0.0000 | 0.0000 |
| Age | 0.0000 | 0.0000 | 0.0167 | 0.0000 |
| Do you smoke | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
| CleanedSmokingFrequency | 0.0000 | 0.0023 | 0.0000 | 0.0000 |
| When did you start smoking | 0.0000 | 0.0031 | 0.0067 | 0.0000 |
| NumericFamilySmoking | 0.0000 | 0.0000 | 0.0133 | 0.0000 |
| NumericConflictResponses | 0.0381 | 0.0000 | 0.0000 | 0.0080 |
| NumericImpactResponses | 0.1875 | 0.1459 | 0.1867 | 0.2433 |
| NumericAcceptanceResponses | 0.0000 | 0.0000 | 0.0000 | 0.0000 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Chwał, J.; Kostka, M.; Kostka, P.S.; Dzik, R.; Filipowska, A.; Doniec, R.J. Analysis of Demographic, Familial, and Social Determinants of Smoking Behavior Using Machine Learning Methods. Appl. Sci. 2025, 15, 4442. https://doi.org/10.3390/app15084442
