An Integrated Framework for Architectural Visual Assessment: Validation of Visual Equilibrium Using Fractal Analysis and Subjective Perception

Aloshan, Mohammed A.; Sanad, Ehab Momin Mohammed

doi:10.3390/buildings16020345


by Mohammed A. Aloshan ¹,* and Ehab Momin Mohammed Sanad ²

¹ Department of Architectural Engineering, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh 11432, Saudi Arabia

² Department of Architecture, College of Architecture and Planning, Qassim University, Qassim 52571, Saudi Arabia

* Author to whom correspondence should be addressed.

Buildings 2026, 16(2), 345; https://doi.org/10.3390/buildings16020345

Submission received: 31 October 2025 / Revised: 17 December 2025 / Accepted: 23 December 2025 / Published: 14 January 2026

(This article belongs to the Special Issue Advanced Studies in Urban and Regional Planning—2nd Edition)


Abstract

In recent decades, multiple approaches have emerged to assess architectural visual character, including fractal dimension analysis, visual equilibrium calculations, and visual preference surveys. However, the relationships among these methods and their alignment with subjective perception remain unclear. This study applies all three techniques to sample mosques in Riyadh, Saudi Arabia, to evaluate their validity and interconnections. Findings reveal a within-sample tendency toward low visual complexity, with fractal dimensions ranging from 1.2 to 1.547. Within this small, exploratory sample of five large main-road mosques in Riyadh, correlations between computed visual equilibrium and survey results provide preliminary, sample-specific convergent-validity evidence for Larrosa’s visual-forces method, rather than general validation. Within this sample, traditional façades with separate minarets tended to score as more visually balanced than more contemporary compositions. This triangulated approach offers an exploratory framework for architectural visual assessment that integrates objective metrics with human perception.

Keywords: visual assessment; fractal geometry; visual complexity; visual perception; visual equilibrium

1. Introduction

Visual assessment is a critical tool in understanding architectural character and is essential for guiding design and preserving urban identity. Given its inherently subjective nature, researchers have developed quantitative techniques to improve objectivity, including parsing methods for façade pattern analysis [1], perceptual mathematical models [2], and fractal dimension (FD) calculations first introduced by Carl Bovill in 1996 [3]. These computational approaches vary across visual complexity, perceptual balance, statistical components, and image segmentation. Focusing on Riyadh’s mosque architecture—central to the city’s identity and cultural heritage—this study integrates objective metrics with subjective evaluation to ensure that computational results resonate with human visual experience and aesthetic judgment [4]. Riyadh has a distinct architectural character rooted in the Najdi region, whose vernacular mud-brick and stone architecture is characterized by compact courtyard layouts, thick walls, simple volumes, and distinctive triangular or geometric openings and crenellation [5,6]. Contemporary architecture in Riyadh has, to some extent, preserved and reinterpreted this identity in new mosque and civic projects, using traditional Najdi elements within modern materials and forms [7,8,9]. In this study, we apply computational visual assessment to a small set of contemporary mosque buildings in Riyadh, built within approximately the last two decades, as an exploratory case study of this recent phase. Throughout, human observers remain central: computational measures are interpreted only in relation to expert and advanced-student judgments and are proposed as aids to, not substitutes for, the evaluation of visual character.

1.1. Research Problem and Gap

Although fractal dimension has been widely used and calibrated as a measure of visual complexity, interpretation still requires contextual understanding. By contrast, visual equilibrium has only recently been operationalized by Larrosa’s perceptual visual-force model, and published applications provide only a small number of case studies and limited-sample comparisons to human judgments [2,10], in sharp contrast to the much larger set of FD studies that link complexity scores to perception [11,12,13,14,15]. Current design review and legislation demand methods that are both objective and practically interpretable; however, the relationship between visual complexity and visual equilibrium, and the extent to which it aligns with human preference, remains under-examined. A comparative, multi-method framework is therefore needed to clarify these relationships and to test the validity of perceptual models against established complexity metrics and subjective judgments.

1.2. Objectives

This exploratory pilot study examines five large mosque façades on main roads in Riyadh and triangulates three techniques: (i) quantitative fractal-dimension analysis for visual complexity, (ii) quantitative visual-force calculations for visual equilibrium (Larrosa), and (iii) qualitative/subjective visual preference. The aim of this research is not to classify mosque types in Riyadh or to map the full diversity of mosque styles, but rather to explore the application of a triangulated visual-assessment framework—combining fractal dimension (FD), Larrosa’s visual equilibrium (VW), and expert perception—using a small set of façades as a testbed for comparing the three methods. Specifically, we:

Quantify visual complexity and visual equilibrium for representative mosque façades in Riyadh;
Provide initial convergent evidence for Larrosa’s visual-force method by testing its correlation with subjective perceptions of visual balance in an expert sample of architects and advanced students;
Investigate the relationship between visual equilibrium and visual complexity by comparing Larrosa’s outputs with FD results;
Establish a comprehensive, multi-method system of visual inquiry tailored to Riyadh’s architectural context.

Working propositions. Because openings/voids reduce both visual weight and FD while projections/solids have the opposite effect, we anticipate systematic association between equilibrium and complexity at the façade scale. Throughout, FD and VW are treated not as stand-alone descriptors but as candidate quantitative summaries whose usefulness is evaluated against expert perceptual judgments.

1.3. Contributions

This paper offers four contributions:

Exploratory, sample-specific convergent-validity evidence for Larrosa’s visual-equilibrium model in this convenience sample of experts and advanced students, via correlations with subjective preference and FD, without claiming general validation beyond the present façades.
A comparative account of visual equilibrium (visual weight) and visual complexity (FD) for the same buildings.
An integrated, multi-method quantitative framework for assessing Riyadh’s mosque architecture.
An illustrative visual-assessment framework that can be adapted to other building types in future studies.

1.4. Scope and Limitations

The study does not measure beauty or psychological impact per se. Rather, it computes visual complexity and visual equilibrium and compares them with subjective impressions to understand Riyadh’s visual character. Religious buildings are used as stimuli because their salience supports reliable subjective judgments; “beauty” serves only as a sampling rationale for spiritually impactful cases.

The empirical component of this study is based on a deliberately small, purposive sample of five large mosques located on primary roads in Riyadh. These buildings were chosen because (i) their façades are highly visible in the everyday visual field of city users, (ii) their unobstructed main elevations enable reliable and comparable FD and visual-equilibrium calculations, (iii) they include minarets that create strong vertical–horizontal proportioning relevant to Larrosa’s approach, and (iv) CAD drawings and models were accessible for manual analysis. No formal city-wide sampling frame of “large, main-road mosques” was available at the time of the study; therefore, the five cases should be regarded as a convenience-based illustrative subset, not a statistically representative subsample of Riyadh’s mosque stock. Any statements about “Riyadh’s mosque architecture” in this paper should therefore be read as referring specifically to this subset of large, visually prominent main-road mosques. The findings provide within-sample evidence about the relationships between FD, visual equilibrium, and subjective preference, but do not estimate the full distribution of visual characteristics across the city’s mosque stock or the wider population of large main-road mosques; the five study sites are treated as an illustrative case series within this visually salient subtype. In addition, because brief construct explanations were provided immediately before the rating tasks, metric–survey correspondences should be read as preliminary, exploratory convergent evidence within this specific sample rather than as independent validation of Larrosa’s method in mosque architecture more broadly.

The remainder of this paper presents a brief literature review (Section 2), the methods for FD, visual equilibrium, and the survey (Section 3), the empirical results and triangulated analysis (Section 4), and conclusions and directions for future research (Section 5).

2. Literature Review

Religious buildings have historically shaped the visual and spiritual identity of cities and civilizations [16,17]. In Saudi Arabia, mosques are central to urban heritage, and in the Najd region—particularly Riyadh—distinctive compositional traits convey clarity, openness, and multifunctionality that link tradition with a forward-looking urban vision [9].

Although religious architecture is materially constructed, it often evokes spiritually inflected experiences [18]; this underlines the cognitive–perceptual dimensions of how observers engage with architectural form. Classic accounts of the sublime (Kant; Burke) describe awe and boundlessness when confronting works perceived as powerful or vast [19,20], a response frequently reported in religious settings [18]. This strong subjective salience motivates the use of visual preference as a benchmark when evaluating computational visual-assessment methods in religious architecture.

2.1. Visual Perception

Vision typically dominates multisensory appraisal of the built environment [21] and involves both reception and higher-order cognition [22]. Building on a Kantian reading of Gestalt theory, Arnheim emphasized that vision is not a mechanical recording but an active organization of sensory material according to principles such as simplicity, regularity, and balance [4,23]. Accordingly, architectural perception privileges structural patterns over isolated elements, which supports the development of computational measures aimed at capturing those higher-order configurations.

2.2. Computational Visual Assessment

To reduce subjective bias, researchers have developed computational approaches spanning (i) image parsing and processing, e.g., line segmentation, edge detection, component-based façade parsing, and Hough-transform-aided feature extraction [1,24,25,26]; (ii) visibility analyses, such as façade isovists [27]; and (iii) metric models that quantify particular constructs, notably fractal dimension (FD) for visual complexity and perceptual visual-force models for equilibrium [2,11]. These method families differ in what they measure (complexity, balance, visibility), how inputs are segmented, and how results are interpreted for design review and codes.

In parallel with these classical image-processing and metric-based approaches, recent years have seen rapid growth in computer-vision and deep-learning methods applied to façades and urban scenes. End-to-end convolutional neural networks now perform façade parsing and pixel-wise semantic segmentation of façade components (e.g., windows, doors, balconies, cornices), enabling automated extraction of architectural elements from street-level imagery for large datasets [28]. Building on such pipelines, subsequent work has focused on deep-learning models for detecting and quantifying façade elements and opening patterns, which supports tasks such as window-to-wall ratio estimation and façade-element statistics at the district or city scale [29].

Beyond individual buildings, CNN-based frameworks have been used to measure façade color distributions and functional classifications from street-view images at city scale, and to reconstruct 3D façade geometry for urban modeling and heritage documentation [30]. Synthetic reviews of computer-vision analysis of buildings and the built environment emphasize that these deep-learning approaches excel at large-scale automation and scene understanding yet often rely on substantial labeled datasets and operate as high-dimensional ‘black-box’ predictors whose internal representations are not easily interpretable by designers [31]. For design-review panels and code authorities, this opacity is a practical barrier: any quantitative index used in approvals or guidelines must be explainable, traceable to visible geometric relations on the façade, and expressible as simple ranges or thresholds that can be checked directly on drawings.

Within this broader landscape, the present study does not propose a new deep-learning architecture. Instead, it focuses on a complementary problem: testing whether interpretable, low-dimensional metrics—fractal-dimension complexity and Larrosa’s proportion-based visual equilibrium—align with experts’ and students’ subjective judgments for a coherent architectural typology (Riyadh mosques). While FD has already been used as an interpretable scalar index of visual complexity in numerous architectural and perceptual studies, Larrosa’s equilibrium formulation has so far appeared only in a small number of case studies and limited sample applications and has not yet been calibrated against expert judgments across building types. This imbalance helps explain why equilibrium remains under-explored relative to complexity, despite its close conceptual connection to design talk of balance and visual weight. By triangulating FD, equilibrium scores, and visual-preference data on the same façades, the study provides validation evidence for perceptual models that can be directly read and adjusted by designers and regulators. These transparent metrics are intended to complement, rather than replace, deep-learning pipelines by offering architecturally legible indicators that can inform design review and code development in contexts where large training datasets and high-end computational infrastructure may not be available.

2.3. Fractal Dimension as a Measure of Visual Complexity

Mandelbrot’s fractal concept formalized scale-dependent self-similarity in natural and man-made patterns [32,33,34]. This mathematical representation has enabled the creation of complex images such as the Mandelbrot set and Julia sets. In architecture, FD, commonly estimated via box-counting, has been interpreted as a proxy for visual complexity [12]. Early applications by Bovill quantified characteristic façade complexity (e.g., the Robie House) [3,17], while subsequent work refined scaling protocols and stressed careful image preparation (thresholding, segmentation, iteration depth) to improve reliability (e.g., Ostwald and Tucker) [12,17,35]. Studies increasingly combine manual and digital workflows (e.g., box counting assisted by AutoCAD, version 2026; Autodesk, 2025) to operationalize FD across typologies [36]. The consensus is that FD is robust for relative comparisons of complexity, yet its perceptual meaning is context-dependent; FD magnitudes require interpretation against typology, composition, and viewing scale [11,12,13]. Figure 1 illustrates the box-counting procedure applied to one of the study mosque façades at three grid sizes, corresponding to the coarse (FD1), intermediate (FD2), and fine (FD3) scale bands used in this study.

2.4. Perceptual Visual Forces and Visual Equilibrium

Building on Arnheim’s visual dynamics—where perception organizes form according to principles such as balance and the “upward thrust” of verticals—Larrosa formalized a computational framework that models visual equilibrium as the resultant of two opposing, proportion-driven forces: an ascending (principal) force associated with upward thrust and a descending (complementary) force associated with perceived visual weight [4]. Conceptually, a façade approaches equilibrium when the algebraic sum of these forces is near zero; large negative values indicate a heavy, ground-piercing tendency, while large positive values indicate a light, buoyant tendency [2,10]. Although operational and attractive for design review, Larrosa’s formulation has received comparatively less empirical validation against human judgments than fractal-dimension (FD) approaches—hence the value of studies that place equilibrium measures alongside FD and preference data [2,10]. Whereas FD has been examined in numerous perceptual studies that relate its values to preference and other visual responses [11,12,13,14,15], empirical tests of Larrosa’s equation remain limited to a handful of examples and small-scale evaluations, with no broad calibration across building types.

Conceptual example (Figure 2). Variations in column proportion and entablature depth illustrate how perceived balance shifts with geometric relations. In a didactic trio, a 1:9 column with an oversized entablature yields a net negative (heavier) outcome; a 1:8 proportion approximates equilibrium; and a 1:7 with a shallower entablature produces a net positive (lighter) outcome—aligning with Arnheim’s account of counter-tension between vertical and horizontal components [4].

2.5. Visual Preference as Validation Evidence

In environmental design research, visual preference is frequently employed to assess whether computed visual metrics show patterns that are consistent with perception, serving as an external perceptual check rather than a definitive criterion of truth for objective measures [11]. Prior work reports systematic relationships between complexity and preference; for example, Abboushi found peak visual interest at fractal dimension (FD) values of approximately 1.5–1.7 when testing architecture students [15], while Hussein showed a direct effect of manipulated visual complexity on preference ratings [14]. Methodologically, studies often use two-dimensional projections to control parameters and isolate visual variables, which improves internal validity and analytic precision [11]. Taken together, this literature supports using preference data as an independent perceptual benchmark when interpreting computational assessments (e.g., FD-based complexity and equilibrium scores). Preference ratings help relate subjective appraisal to objective quantification, while recognizing that any correspondence is partial, context-dependent, and does not constitute a general validation of the metrics [11,14,37].

2.6. Gap and Rationale

Across these strands, two linked gaps persist: (i) equilibrium metrics remain empirically under-validated relative to FD, and (ii) few studies jointly compare FD and equilibrium on the same façades while testing both against subjective preference. Addressing this gap motivates the present triangulated design that relates FD-based complexity, proportion-based equilibrium (Larrosa), and human judgments within a coherent typological sample (Riyadh mosques), as detailed in the Methods section.

3. Methodology

This study implements a triangulated, mixed-methods design to (i) quantify visual complexity via fractal dimension (FD), (ii) quantify visual equilibrium via Larrosa’s perceptual visual-force formulation, and (iii) obtain visual-preference judgments, then statistically examine their relationships on the same set of mosque façades in Riyadh. The workflow comprises sample selection and image acquisition, standardized elevation representation (Level 3), FD computation, equilibrium computation and aggregation, and an external validation survey, followed by correlation analyses among metrics and survey outcomes. For these façade-level associations (n = 5), we summarize correspondence using Spearman’s ρ as descriptive effect sizes; no multiple-testing corrections are applied, and survey predictors are façade-level mean ratings rather than individual-level covariates (Figure 2, Figure 3 and Figure 4; Table 1).
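As a rough illustration of the façade-level summary described above, the following sketch computes Spearman’s ρ in plain Python for two lists of five values. The function names (`average_ranks`, `spearman_rho`) and the input numbers are hypothetical placeholders, not the study’s data or code. Ties are given average ranks, which is an assumption, since the text does not state how ties were handled.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined when one variable is constant")
    return cov / math.sqrt(var_x * var_y)

# Placeholder values for five facades (NOT the study's results).
vw_scores = [-3.0, -1.5, 0.2, 1.1, 2.4]      # hypothetical net visual weights
mean_ratings = [2.9, 2.1, 3.3, 3.8, 4.0]     # hypothetical mean balance ratings
print(spearman_rho(vw_scores, mean_ratings))  # 0.9 for these placeholder values
```

With only five façades, ρ can take only a small set of discrete values, which is one reason to read it as a descriptive effect size, as the text does, and not as an inferential statistic.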

3.1. Study Sites, Sampling Frame, and Inclusion Criteria

Mosques were sampled from primary roads in Riyadh to reflect the city’s everyday visual field. Inclusion criteria: (i) location on a main artery; (ii) comparatively large scale and clear street presence (often with dual minarets); (iii) architectural diversity spanning traditional and contemporary idioms; (iv) unobstructed visibility of the principal road-facing elevation. Exclusion criteria: substantial occlusions (e.g., trees, scaffolding) covering >20% of key elevation features; nighttime images; severe perspective obstruction. Final selections and map links are listed in Table 1. Because no comprehensive, type-coded inventory of Riyadh mosques was available, we did not construct a formal statistical sampling frame or estimate the total number of eligible large main-road mosques; instead, the five cases in Table 1 were purposively chosen to span traditional and more contemporary compositions within the inclusion criteria. In future work, we will extend the sampling frame to neighborhood-scale mosques across multiple Riyadh districts—including single-minaret and attached-minaret types and varied courtyard configurations—to test generalizability beyond major arterial sites.

3.2. Image Acquisition and Elevation Representation (Level 3)

Each mosque was photographed orthogonally to the main road façade under daylight with comparable illumination. Images were rectified (keystone/perspective correction as needed) and cropped to the principal elevation. Following Ostwald’s five-level representation scheme, we used Level 3 (overall detail beyond openings, short of material texture) to standardize drawings for analysis, given prior evidence that Levels 3 and 4 yield similar FD results while Level 3 reduces preprocessing variability [38]. The façade samples were re-generated as monochrome AutoCAD (version 2026, Autodesk Inc., San Francisco, CA, USA) elevation drawings; no image-based thresholding or other digital pre-processing was applied. Edges were manually traced in AutoCAD (including primary outlines, openings and cornices, while excluding signage, overhead wires, and temporary objects), and all lines were standardized to a line width of 0.20 mm before applying the six box-counting grid sizes.

3.3. Fractal Dimension (FD) Computation

Fractal dimension (FD) was used to operationalize visual complexity and was computed via box-counting on the Level-3 elevation drawings [11,13,39]. FD was calculated for (a) each mosque elevation and (b) the adjacent skyline window for contextual comparison [11,12,13,36,39]. All computations were performed manually in AutoCAD on vector façade drawings rather than on raster images.

Vector drawings were first cleaned to remove redundant lines and to ensure a uniform line thickness before grid overlay. A square grid was then superimposed at six box sizes (7200, 3600, 1800, 900, 450, and 225 mm). These six sizes were grouped into three two-scale segments corresponding to FD1, FD2, and FD3 (coarse, intermediate, and fine scales). For each segment, two box sizes (s₁, s₂) were used (e.g., 7200 mm and 3600 mm), and the number of occupied boxes at each size (N_s₁, N_s₂) was counted manually.

FD for each segment was then computed from the two log–log points using base-10 logarithms as:

$$FD = \frac{\log(N_{s_2}) - \log(N_{s_1})}{\log(1/s_2) - \log(1/s_1)}$$

where $s$ is the box size and $N_s$ is the occupied-box count at that scale. The denominator uses the reciprocal of the box size, as in the standard box-counting formulation, so that FD is positive when the count rises as the boxes shrink. Because each FD estimate is based on only two box sizes, the regression line through these two points is exact (R² = 1.000), and 95% confidence intervals for the slope cannot be estimated (zero degrees of freedom).
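The two-point estimate can be sketched in Python as follows. This is a minimal illustration, not the authors’ procedure (which was carried out manually in AutoCAD). It uses the standard box-counting convention of taking the slope against log(1/s). The pairing of box sizes into FD2 and FD3 is an assumption (the text gives only 7200 mm and 3600 mm as an example pair), and the occupied-box counts are hypothetical.

```python
import math

# Assumed grouping of the six box sizes (mm) into three two-scale bands.
# Only the first pair is given explicitly in the text; the rest is a guess.
BANDS_MM = {"FD1": (7200, 3600), "FD2": (1800, 900), "FD3": (450, 225)}

def two_point_fd(s1, n1, s2, n2):
    """Two-point box-counting estimate.

    s1, s2: box sizes (same unit); n1, n2: occupied-box counts at each size.
    Returns the slope of log10(N) against log10(1/s).
    """
    if min(s1, s2, n1, n2) <= 0:
        raise ValueError("box sizes and counts must be positive")
    if s1 == s2:
        raise ValueError("the two box sizes must differ")
    rise = math.log10(n2) - math.log10(n1)
    run = math.log10(1 / s2) - math.log10(1 / s1)
    return rise / run

# Hypothetical counts for one facade (NOT measured values).
counts = {7200: 12, 3600: 36}
s1, s2 = BANDS_MM["FD1"]
fd1 = two_point_fd(s1, counts[s1], s2, counts[s2])
print(round(fd1, 3))
```

Two sanity checks follow from the definition: if halving the box size doubles the count, the estimate is 1 (a line), and if it quadruples the count, the estimate is 2 (a filled plane).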

Accordingly, throughout this paper FD1–FD3 are treated as fixed descriptive indices, conditional on the chosen two-scale grid and manual counting procedure; no standard errors, confidence intervals, or formal goodness-of-fit tests are associated with the FD slopes themselves. The three bands (FD1–FD3) were specified a priori on theoretical grounds, following Bovill’s box-counting scheme as refined by Ostwald, to approximate coarse, intermediate, and fine architectural scales. FD3, based on the smallest box sizes, was expected to be most sensitive to self-similar façade detail. None of the bands were adjusted in response to observed correlations or survey results, and all are retained as descriptive indices to allow multi-scale comparison.

3.4. Visual-Equilibrium Computation (Larrosa)

Larrosa’s perceptual force model estimates a façade’s equilibrium as the algebraic combination of an ascending (upward-thrust) component and a descending (visual-weight) component derived from shape proportions. To operationalize this on mosque elevations, we (i) partitioned each Level-3 elevation into geometrically coherent rectangular parts that reflect visible relations and compositional unity, (ii) measured each part’s vertical extent (a) and horizontal extent (b), and (iii) aggregated signed force contributions into a net visual weight, VW.

Canonical formulation (Figure 3). For a rectangular part with vertical side $a$ and horizontal side $b$, Larrosa defines a quadratic proportion $P$ and two forces, ascending $|F_p|$ and descending $|F_c|$, typically expressed as [2]:

$$P = \log\left(\frac{a^{2}}{b^{2}}\right) + 1 \quad \text{(quadratic proportion of the shape)}$$

$$|F_p| = b \times P \quad \text{(ascending / upward thrust)}$$

$$|F_c| = \frac{a}{b} \quad \text{(descending / visual weight)}$$

In this study, all dimensions “a” and “b” are measured in meters on scale-correct CAD elevations, and the logarithm in the proportion formula uses base-10. The derived quantities P, |Fp|, |Fc| and VW are treated as dimensionless indices. Component-level values reported are rounded to three decimal places, and the ‘Total visual weight’ in each table is obtained by algebraically summing the component VW values in the last column.
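For a single rectangular part, these definitions can be transcribed into Python as below. This is a sketch of the formulas as printed in this section (base-10 logarithm, dimensions in meters). The names `PartForces` and `part_forces` are hypothetical, and the example dimensions are illustrative, not values from the study.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class PartForces:
    proportion: float   # P, quadratic proportion
    ascending: float    # |F_p| = b * P
    descending: float   # |F_c| = a / b

def part_forces(a, b):
    """Forces for one rectangle with vertical side a and horizontal side b."""
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive lengths")
    p = math.log10(a ** 2 / b ** 2) + 1
    return PartForces(proportion=p, ascending=b * p, descending=a / b)

# Illustrative rectangles (NOT taken from the study's facades).
square = part_forces(10.0, 10.0)   # P = 1, |F_p| = 10, |F_c| = 1
slender = part_forces(10.0, 1.0)   # P = 3, |F_p| = 3,  |F_c| = 10
print(square, slender)
```

Under this transcription a wide part gains ascending force through $b$, while a tall, narrow part gains descending force through $a/b$. Note also that $P$ becomes negative once $a/b$ falls below about 0.316, a property of the printed formula worth keeping in mind for very low, wide parts.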

Figure 3. Force vectors and symbols for a rectangular part ($a$, $b$, $P$, $|F_p|$, $|F_c|$) used in Larrosa’s formulation. Notes for reporting: (i) state the logarithm base; (ii) define the measurement unit and image scale; (iii) confirm that $a$ denotes the vertical extent and $b$ the horizontal extent across all parts; (iv) specify rounding/precision.


Aggregation over a façade (Figure 4). To estimate the façade’s resultant visual weight:

1. Partition the elevation into geometrically unified parts (rectangles) that reflect visible relations and compositional logic.
2. Measure $(a, b)$ for each part and compute $P$, $|F_p|$, and $|F_c|$.
3. Apply the sign convention: ascending components are summed as positive; descending as negative.
4. Handle voids/openings: treat openings as opposite-signed contributions (subtract their ascending component; add their descending component) to reflect their lightening effect.
5. Sum across all parts to obtain the net value $VW$. Interpret as:

○ $VW > 0$: visually light, upward tendency;
○ $VW \approx 0$: visual equilibrium;
○ $VW < 0$: visually heavy, ground-piercing tendency.

Figure 4. Illustration of visual weight calculation method, drawn by the authors. Workflow for façade partitioning and force aggregation; sign convention and treatment of openings.
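The aggregation steps above can be sketched in Python as follows. This is an illustrative reading of the sign convention, not the authors’ worksheet: each solid part contributes $|F_p| - |F_c|$, each void contributes the opposite sign, component values are rounded to three decimals, and the total is their sum. The function names and the example parts are hypothetical.

```python
import math

def part_vw(a, b, is_void=False):
    """Signed visual-weight contribution of one rectangular part.

    a: vertical extent, b: horizontal extent (same unit, e.g. meters).
    Solids: +|F_p| - |F_c|. Voids: -|F_p| + |F_c| (opposite-signed).
    """
    if a <= 0 or b <= 0:
        raise ValueError("a and b must be positive lengths")
    p = math.log10(a ** 2 / b ** 2) + 1   # quadratic proportion P
    net = b * p - a / b                   # ascending minus descending
    return round(-net if is_void else net, 3)

def facade_vw(parts):
    """Sum component contributions; parts is a list of (a, b, is_void)."""
    return round(sum(part_vw(a, b, v) for a, b, v in parts), 3)

# Hypothetical partition (NOT a facade from the study).
parts = [
    (10.0, 10.0, False),  # square solid block: contributes +9.0
    (10.0, 1.0, False),   # slender solid element: contributes -7.0
    (10.0, 1.0, True),    # slender opening: contributes +7.0
]
print(facade_vw(parts))
```

Because the text gives no numerical tolerance for $VW \approx 0$, the sketch deliberately stops at the net value and leaves the light / balanced / heavy reading to the analyst.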

In this study, each Level-3 elevation was partitioned into rectangular parts according to a simple protocol: visually coherent architectural components (e.g., prayer-hall block, portico, podium, minarets, major cornices) were treated as single parts; openings and recesses were treated as voids with opposite-signed contributions; and homogeneous regions with similar proportions and function were kept grouped rather than arbitrarily subdivided. Segmentation was performed by the first author and reviewed by the second author to ensure consistent application of these rules across all façades. We did not, however, perform formal inter-operator agreement testing (e.g., blind repeat partitioning or ICC/Kappa statistics), so the VW estimates reported here should be interpreted as conditional on this specific operationalization of Larrosa’s method and as pilot-level, hypothesis-generating values pending multi-rater validation.

3.5. Visual-Preference Survey (External Validation)

The validation process involved a survey designed to compare calculated results to subjective impressions, testing the reliability of the quantitative data against human perception. The survey was created by the authors to directly inquire about participants’ visual impressions of the sample buildings. It was distributed both through an online link and directly shared with architects, architecture professors, and final-year architecture students via professional groups and personal contacts. The purpose and intent of the questions were clearly explained, with specific emphasis on accuracy for student participants. The survey thus acts as a perceptual anchor, providing an exploratory check on how fractal-dimension and visual-weight computations align with expert impressions within this sample. In this way, the human factor is built into the methodology rather than treated as an afterthought.

3.5.1. Survey Objectives

The survey was designed to achieve the following objectives:

To determine whether participants’ subjective impressions align with the calculated visual complexity (fractal dimension).
To examine whether computed visual complexity and equilibrium show convergent patterns with expert participants’ judgments of the same façades, as an external check on the computational methods.
To establish the correlation between subjective and objective visual assessment approaches, thereby testing the reliability of the computational models.

3.5.2. Survey Structure

The survey consisted of three main sections:

Demographic Information—including participant country, gender, and professional background.
Visual Complexity—questions designed to assess subjective impressions of each sample’s visual complexity.
Visual Equilibrium—questions focused on subjective perceptions of visual balance for each sample.

3.5.3. Survey Design

The questions were structured as a direct inquiry into the subjective assessment of visual complexity and visual equilibrium for each of the mosque samples. To ensure clarity and improve validity, two measurement scales were used:

Multiple-choice questions, offering descriptive options.
Rating scales, allowing participants to score attributes on a numerical scale.

The goal was to validate the computed results for visual complexity and equilibrium by comparing them with user perception. Therefore, questions were limited to these two variables, intentionally avoiding broader subjective themes that could skew the focus or introduce bias. The survey was designed as a validation tool for the computational visual assessment methods.

3.5.4. Survey Implementation

The survey was conducted in Saudi Arabia and Egypt to ensure a culturally relevant yet slightly varied assessment of the mosque architecture. It was administered and monitored by the authors over a two-month period (April–May 2024) and collected 114 valid responses from three groups:

21 architecture academics
47 professional architects
46 final-year architecture students

The data showed no outliers, and it was noted that participants had a clearer understanding of visual equilibrium than visual complexity, likely due to the more intuitive nature of balance compared to intricacy.

Survey Participant Selection Criteria

Participants were selected according to the following criteria:

Relevant Academic or Professional Background: Individuals with experience or education in architecture, visual design, or related disciplines, ensuring a foundational understanding of visual assessment.
Diverse Expertise Levels: A mix of academics, practitioners, and advanced students to represent a broad but relevant range of perspectives.
Cultural Context: Participants from Saudi Arabia and Egypt ensured familiarity with mosque architecture and allowed exploration of cross-cultural perceptual consistency.

3.5.5. Ethical Approval and Informed Consent

The survey collected anonymous professional opinions from adult volunteers in Saudi Arabia and Egypt about non-sensitive architectural façades. In accordance with the applicable institutional guidelines at the time of data collection, this type of minimal-risk, anonymous opinion survey of adults was exempt from formal ethics committee review. Participation was voluntary, and information on the study purpose, procedures, and the right to withdraw at any time was provided on the first page of the questionnaire. The form explicitly informed participants that no personal details (e.g., name or email address) would be collected automatically unless they chose to provide them. No identifying personal data were required, and responses were stored anonymously on a password-protected institutional drive and used solely for research purposes.

3.6. Mosque Selection Criteria

The mosques selected for this study are situated along the main roads of Riyadh, chosen for their impactful visual presence to city observers. Riyadh, characterized by its sprawling horizontal layout, necessitates navigation primarily through its main roads. Consequently, the visual perception of the city is predominantly shaped by these thoroughfares, with mosques along these routes forming a significant part of the city’s visual landscape.

Moreover, larger mosques are prioritized over smaller ones due to their size, which surpasses that of average buildings. Additionally, larger mosques often feature two minarets, contributing to their visual balance and prominence. Riyadh’s mosques exhibit a diverse range of architectural styles, and the selected mosques encompass this variety to explore the visual complexity inherent in each style and its contextual suitability. The selected mosques should meet the following criteria:

Location: Only mosques located along Riyadh’s primary roads were chosen, as these structures significantly contribute to the city’s visual identity due to their high visibility.
Scale and Prominence: Larger mosques were prioritized because of their greater visual impact and architectural detailing, often including dual minarets which contribute to visual symmetry and equilibrium.
Architectural Diversity: The selected samples represent a diverse range of architectural styles prevalent in Riyadh, ensuring that the study captures variations in visual complexity and equilibrium across different designs, including modern and traditional styles. This diverse range of visual styles represents the visual characteristics of Riyadh.
Cultural and Contextual Significance: Each mosque plays a representative role in reflecting Riyadh’s traditional and modern architectural trends.

This selection strategy intentionally privileges large, highly visible mosques on primary roads and thus constitutes a non-probability, purposive sample. As such, it is suitable for controlled comparison of FD, visual equilibrium, and preference ratings across a coherent typology, but it does not support statistical generalization to all mosque types in Riyadh. We therefore interpret the results as indicative patterns for this specific class of prominent mosques, to be tested in future research using larger, stratified samples that include neighborhood-scale and historic mosques across multiple districts.

4. Results and Discussion

We assessed the selected mosque façades using three complementary lenses: fractal dimension (FD; a proxy for multi-scale visual complexity), Larrosa’s visual equilibrium (net visual weight VW), and a visual-preference survey (behavioral external criterion). We first report each measure, then examine their associations (FD—VW; VW—preference; FD—preference). Survey- and VW-based effects are reported with point estimates and, where estimable, 95% confidence intervals. Façade-level associations among FD, VW, and survey means (n = 5) are summarized using Spearman’s ρ, interpreted descriptively as exploratory effect sizes; we do not report p-values or adjust for multiple testing at this level. FD values are treated as façade-level indices without associated confidence intervals.

4.1. Fractal Dimension Calculation for the Mosque Facades

Each mosque was modeled from architectural drawings, and a Level-3 front elevation (massing and primary edges, without textures) was prepared in AutoCAD for box-counting. For FD calculation, we applied the manual CAD-based procedure described in Section 3.3. A square grid was superimposed at six box sizes (7200, 3600, 1800, 900, 450, and 225 mm), grouped into three two-scale segments corresponding to FD1 (coarse), FD2 (intermediate), and FD3 (fine) (Table 2). For each segment, the number of occupied boxes was counted manually at the two box sizes, and FD was computed from the two log–log points using the equation given in Section 3.3. Because each FD estimate is based on only two box sizes, the regression line through these points is exact (R² = 1.000) and 95% confidence intervals for the slope cannot be estimated. In the analyses that follow, we therefore treat FD as a deterministic façade-level summary and do not attach sampling-based uncertainty estimates to these values.
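As a minimal sketch of the two-point estimate described above, the following Python snippet computes a box-counting dimension from the occupied-box counts at two box sizes. It assumes that the equation in Section 3.3 is the standard log–log slope, FD = [log N(s2) − log N(s1)] / [log(1/s2) − log(1/s1)]; the function name and the box counts in the usage lines are hypothetical and are not values from Table 2.

```python
import math

# Box sizes (mm) for the three two-scale bands named in the text.
BANDS = {"FD1": (7200, 3600), "FD2": (1800, 900), "FD3": (450, 225)}


def two_point_fd(size_coarse, count_coarse, size_fine, count_fine):
    """Slope of log(count) against log(1/size) through exactly two points."""
    if size_fine >= size_coarse:
        raise ValueError("size_fine must be smaller than size_coarse")
    if count_coarse <= 0 or count_fine <= 0:
        raise ValueError("box counts must be positive")
    rise = math.log(count_fine) - math.log(count_coarse)
    run = math.log(1.0 / size_fine) - math.log(1.0 / size_coarse)
    return rise / run


# Hypothetical counts for one band: 40 occupied boxes at 450 mm, 110 at 225 mm.
coarse, fine = BANDS["FD3"]
fd3_example = two_point_fd(coarse, 40, fine, 110)
print(round(fd3_example, 3))
```

With only two points per band the slope is fully determined by the two counts, so a single miscounted box shifts the estimate directly; this is one reason a multi-point fit per band would be a natural robustness check.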

4.2. Fractal Dimension Results Interpretation

Applying the box-counting approach with six box sizes yielded three fractal-dimension (FD) estimates per façade. While smaller box sizes are often expected to produce more accurate results, this is not always the case. The most reliable estimate is typically the one that repeats across scales, forming a consistent pattern. In this paper, however, three visual-assessment approaches are used for verification; therefore, we retain all three FD results for correlation with visual equilibrium and visual preference. Visual-preference data can help compare FD estimates by highlighting the most correlated scale. The finest scale band (FD3) exhibits the strongest correspondence with perceived visual complexity in this sample and is therefore treated as the most informative FD descriptor for the present analysis, while FD1 and FD2 are retained as supplementary multi-scale descriptors.

Across the sample, mosques scored FD values between 1.2 and 1.89, indicating a wide range of visual complexity within Riyadh’s architectural character. Riyadh’s architecture generally tends toward simplicity, light colors, and minimal ornament. Relative to its predecessor—the Najdi style in the old center of Riyadh—medium to low FD values are expected. Ostwald classifies FD as low (≤1.5), exemplified by Le Corbusier’s work, and high (≥1.5), as in Frank Lloyd Wright’s architecture, establishing 1.5 as a balance point and arguing that higher FDs are closer to natural patterns [11]. Taylor identifies the 1.3–1.5 FD range as stress-reducing in architecture [40]. In this context, and consistent with repeated results in each case, the mosques in this study generally lean toward lower FD values (below 1.5), aligning with the above expectations.

By case: Al Babtain mosque showed consistent results (1.37–1.44) across the three iterations, indicating a stable self-similarity pattern. Abdulaziz Alfaris mosque dropped from ~1.6 to 1.2 at the third iteration, suggesting fewer close-scale details; façade curves likely elevated the first-iteration value. Al-Mohainy mosque increased from 1.26 (first) to 1.48 and 1.54 (second and third), revealing more small-scale detail, likely due to richer openings and minaret features. Awaidhah mosque decreased from the first to the second iteration and stabilized at 1.52 in the third; its slender design elevated the first-iteration value, whereas small openings reduced the second. King Abdullah mosque began high at 1.89 and decreased to 1.47–1.48 in the second and third iterations. Across mosques, the third iteration appears most reliable: the first and second iterations matched in two cases, while the second and third matched in three, indicating greater stability by the third iteration. Validation against the survey results is required to confirm this inference.

For the third iteration (FD3), visual-complexity values range from 1.203 (minimum) to 1.547 (maximum), forming a consistent pattern of low fractal-dimension complexity, generally below the 1.5 midpoint, within this sample of large main-road mosques. This suggests that, among such visually prominent cases, mosque façades in Riyadh tend toward lower visual complexity overall.

4.3. Visual Equilibrium Calculations

Mosque façades were analyzed in AutoCAD based on their geometry and simplified into rectangular components, as illustrated in Figure 5, Figure 6, Figure 7, Figure 8 and Figure 9. Following Larrosa’s method, openings were calculated with reversed sign [2]. Repeated rectangular elements were summed to compile the calculation tables [Table 3, Table 4, Table 5, Table 6 and Table 7].

The façade partitioning method is based on the illustration in Figure 4. As established by Larrosa, ornaments and curved elements are reduced to simple rectangular boundary shapes, ignoring internal window patterns and treating windows as openings that reduce weight in opposition to solid elements and projections. Curved elements such as arches and circles are bounded by rectangles for calculation purposes, because Larrosa’s equation operates only on rectangular shapes.

The façade segmentation level used for VW adopts the same Level-3 segmentation as for FD (Section 3.2), to match the level of façade representation and accuracy and to make the results more comparable. Larrosa’s original method did not specify a calibrated façade-segmentation level, so this alignment ensures consistency between the FD and VW analyses.

1. Al Jawhara Al Babtain mosque

Recesses and glazing are treated as voids relative to the solid stone elements projecting on the façade (see Figure 5); see detailed calculations in Table 3.

2. Abdulaziz Alfaris mosque

The column capitals are treated as distinct elements forming a repeated pattern; each capital is calculated separately. The glazing behind is treated as openings. Proportions are visually dominated by the columns and the entablature (Figure 6). Detailed calculations are provided in Table 4.

3. Abdullah Nasser Al-Mohainy mosque

The minarets subdivide the façade into smaller parts, affecting overall proportions. All doors and windows are treated as openings that interrupt the main volume (Figure 7). Detailed calculations are provided in Table 5.

4. Awaidhah mosque

Details within openings are ignored—each opening is treated as a single unit to preserve the geometric unity of patterns. Visible volumes are the primary determinants of proportions. Because of the top continuous elements that ensure horizontal continuity, the main building is treated as one element (Figure 8). Detailed calculations are provided in Table 6.

5. King Abdullah Bin Abdul Aziz mosque

The proportions resolve into three primary objects. The minaret is dominant and vertical, forming a single unit anchored to the ground. The main building is divided into two strongly horizontal elements, producing a substantially heavy visual weight. In this case, color contrast is the principal visual differentiator (Figure 9). Detailed calculations are provided in Table 7.

4.4. Interpretation of Visual-Equilibrium Results

The results range from +36 to −30, indicating substantial diversity in visual proportions across cases. For example, in Table 5 the component with a = 9 m and b = 20.8 m yields p = 0.272, |Fp| = 5.665 and |Fc| = 33.045; its visual-weight contribution (VW = −27.380) is obtained as the signed difference between the ascending and descending forces. The façade-level total VW = −30.481 is then calculated by summing all component VW entries in the table. A notable pattern emerges in the minaret proportions, which tend to approach visual equilibrium in all cases except King Abdullah mosque. From the first row of each table, the minaret visual weights are: −0.1 (Al Babtain), −1.3 (Alfaris), +0.9 (Al-Mohainy), and −0.8 (Awaidhah)—a consistent pattern of near-zero (i.e., visually balanced) values. King Abdullah mosque’s minaret, at −2.3, deviates modestly due to its greater height, indicating increased descending visual weight. Given the prominence of minarets in Riyadh’s architecture—and their proportions and forms that differ from other regions—it is noteworthy that, overall, they contribute positively to façade-level equilibrium.
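The arithmetic in the worked example above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' calculation sheet: the class and function names are hypothetical, the derivation of |Fp| and |Fc| from a, b, and p follows Larrosa's equation and is not reproduced here, and the reversed sign for openings is a reading of the rule stated in Section 4.3. Only the first component uses values quoted in the text; the opening is invented purely to show the sign reversal.

```python
from dataclasses import dataclass


@dataclass
class Component:
    label: str
    fp: float  # magnitude of the ascending (principal) force, |Fp|
    fc: float  # magnitude of the descending (complementary) force, |Fc|
    is_opening: bool = False  # assumption: openings enter with reversed sign

    def visual_weight(self):
        """Signed difference |Fp| - |Fc|, sign-flipped for openings."""
        vw = self.fp - self.fc
        return -vw if self.is_opening else vw


def facade_total(components):
    """Facade-level VW: the sum of all component contributions."""
    return sum(c.visual_weight() for c in components)


# The one component quoted in the text (a = 9 m, b = 20.8 m, Table 5).
quoted = Component("quoted component", fp=5.665, fc=33.045)
# Hypothetical opening, included only to show the sign reversal.
opening = Component("illustrative opening", fp=1.0, fc=3.0, is_opening=True)

print(round(quoted.visual_weight(), 3))
print(round(facade_total([quoted, opening]), 3))
```

The quoted component reproduces the reported contribution of −27.380; the façade-level total of −30.481 would follow from passing every row of Table 5 to the summing function.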

Among the cases studied, Al Babtain mosque exhibits the strongest façade-level equilibrium and the most balanced minaret. It also represents a commonly repeated architectural style in Riyadh—two rectangular minarets with standard proportions—which achieves greater visual balance than designs with slimmer, taller, or single minarets. These findings suggest that visual equilibrium is a characteristic feature of Riyadh’s traditional mosque architecture.

4.5. Comparison of Fractal Dimension and Visual Weight

While FD3 shows the closest descriptive alignment with perceived visual complexity in this sample, this is treated as a within-sample tendency rather than a formal scale-selection criterion. [Table 8] presents a side-by-side comparison of each building’s fractal dimension (FD) and visual equilibrium scores. Figure 10 and Figure 11 plot these façade-level indices as point estimates: FD1–FD3 at the three scale bands and the single VW score per façade. As noted in Section 4, these computational measures are derived from a single box-counting procedure and a single documented partitioning scheme and are therefore treated as deterministic façade-level indices without associated confidence intervals.

Visual-weight results in Figure 11 are interpreted by the sign and magnitude of the value. A positive value indicates that the building appears visually lightweight, with an upward tendency. A negative value suggests greater visual weight, giving the impression that the building is grounded or “piercing the earth.” Values near zero signify that the building is visually balanced, i.e., in equilibrium. Sample 5 is perceived by respondents as the lightest, while Sample 3 is the heaviest; samples 1 and 3 are the most balanced in terms of VW, exhibiting values closest to visual equilibrium.

Looking at the fractal-dimension graph in Figure 10, FD1 differs from the other two estimates in three of the five cases, whereas FD2 and FD3 are similar in four of the five cases, indicating that FD2 and FD3 are more stable than FD1 across this sample. In the visual-equilibrium graph (Figure 11), a qualitative similarity is observed between FD1 and VW. The façade-level Spearman correlations between VW and the three FD bands are ρ ≈ 0.07 for FD3, ρ ≈ −0.30 for FD2, and ρ ≈ 0.70 for FD1. Any apparent advantage of one FD band over another in this five-façade sample is therefore interpreted as a descriptive tendency rather than a formal scale-selection rule. Given the small sample (n = 5), these coefficients are interpreted descriptively as effect sizes rather than as the basis for formal inference, but they suggest that the coarse-scale FD1 band aligns most closely with the part sizes used in the visual-equilibrium calculations.
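For readers who wish to reproduce façade-level coefficients of this kind, the following Python sketch computes Spearman's ρ as the Pearson correlation of rank-transformed values, with tied values given their average rank. The function names are illustrative, and the five pairs of values are hypothetical placeholders rather than the entries of Table 8.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined if either input has zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical facade-level indices for five facades (not Table 8 values).
fd_band = [1.10, 1.20, 1.30, 1.40, 1.50]
vw = [2.0, 1.0, 4.0, 3.0, 5.0]
rho = spearman_rho(fd_band, vw)
print(round(rho, 2))
```

With n = 5, swapping the ranks of a single adjacent pair of façades already moves ρ by 0.1, which illustrates why these coefficients are read descriptively rather than inferentially.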

Future work can examine this relationship by computing visual-equilibrium results at multiple shape-analysis scales. For example, if window patterns are considered and glass frames are counted as separate panels, matching the finer 225 mm box-counting scale, results may show scale-specific correlations between visual complexity and visual equilibrium. Comparison with the visual-preference survey will help explore which fractal-complexity scale shows the clearest sample-specific correspondence with perceived complexity. More broadly, future studies should test the robustness of FD–preference and FD–VW patterns by experimenting with alternative box-size progressions, different FD band groupings, and FD estimates based on more than two box-counting points per band.

4.6. Visual Preference Survey Approach

If the mosque images in the survey were 2D elevation drawings identical to those used for computation, they would feel unrealistic and less relatable to respondents. Conversely, if we used photographs, they would not be directly comparable to the 2D elevations. Therefore, we adopted 2D rendered images as a compromise—striking a balance between realism and comparability. A similar image type was used in a visual-preference survey on Chinese courthouse architecture [41].

The visual-preference survey serves as an external criterion to explore how two computational approaches align with expert perception in this specific set of façades, rather than to establish definitive validation. The survey targeted architects, visual-design specialists, architecture academics, design enthusiasts, and architecture students, all evaluating the same five mosque façades in Riyadh that were analyzed computationally. The survey items and 0–10 scale anchors were drafted based on previous visual-preference studies in environmental and architectural design and were iteratively reviewed and refined by the authors to ensure clear definitions of ‘visual complexity’ and ‘visual balance’ and a transparent interpretation of the midpoint (5) as perceived equilibrium. In this way, the survey acts as a perceptual anchor, providing an exploratory check on how FD- and VW-based metrics align with expert impressions within this sample.

To maximize interpretability and reliability, we used two types of questions:

1. Guided multiple-choice items (definitions up front)
Participants first answered multiple-choice questions that clarified what we mean by visual complexity and visual equilibrium. These items both prime understanding and provide a coarse, categorical measure that can be correlated with the scale ratings [Table 9]:

Responses are counted and multiplied by the numerical scores above and later correlated with the scale ratings to assess response consistency. Presenting these definitions before the scale questions helps respondents anchor their subsequent ratings. At the same time, we acknowledge that presenting these definitions and guided categories immediately before the rating tasks may also prime respondents toward the constructs used in the computational metrics; as a result, any survey–metric correspondences should be interpreted as convergent rather than fully independent validation.

2. 0–10 scale ratings (primary measures)

Participants then rated each façade on two 0–10 scales:

Visual complexity: 0 = very simple, 10 = highly detailed/complex, 5 = balanced.
Visual equilibrium (visual weight): 0 = very light/upward, 10 = very heavy/grounded, 5 = equilibrium.

These scale ratings are the primary survey outputs used to compare against fractal dimension and visual equilibrium calculations.

The full survey instrument, including instructions, rendered façade images, multiple-choice questions, and 0–10 rating scales, is provided as Supplementary Material S1.

4.6.1. Results Interpretation

To check internal consistency, we computed Pearson’s correlation coefficient between the two question types (multiple-choice and 0–10 rating scales) for each construct. The internal correlation between the two survey formats for equilibrium is very high (r ≈ 0.99, n = 5, p < 0.01), providing a descriptive indication of excellent consistency in how respondents applied the balance concept, rather than a formal hypothesis test. We then correlated the survey outputs with the computational results to address the study aims. For the façade-level associations among FD, VW, and façade-level mean ratings (n = 5), we report Spearman’s ρ as descriptive effect sizes without p-values or multiple-comparison corrections, given the very small sample size.

As a further accuracy check, we correlated the mean rating-scale scores with the numerically coded multiple-choice counts. A higher correlation indicates clearer respondent understanding (greater survey accuracy), whereas a lower correlation suggests possible confusion or construct misinterpretation.

Visual complexity. For comparability with fractal dimension (FD) values, rating-scale results were linearly mapped to the FD range [1,2]. Specifically, the average scale score x was transformed as:

\text{Mapped complexity} = 0.1 \times x + 1

Thus, a scale score of 10 becomes 2.0 and a score of 0 becomes 1.0, aligning the survey outputs with the conventional FD range [11]. Because this is a simple linear, monotone transformation of the 0–10 ratings, it preserves the rank ordering of façades and therefore does not change any Spearman correlations; it is used solely to place the survey-based curve on the same vertical axis as the FD values for visualization. The mapped survey values were then correlated with FD1, FD2, and FD3. Given the small n and exploratory nature of the study, weak correlations are interpreted simply as weak associations, without attributing error specifically to either the survey or the FD measure.

Visual equilibrium: Larrosa’s logarithmic formulation treats the principal/ascending force as positive and the complementary/descending (weight) as negative; hence, negative sums indicate heaviness, and positive sums indicate lightness. In the survey, however, the 0–10 scale was anchored so that higher numbers = heavier appearance, in order to avoid confusion about the sign convention. To align the survey outputs with the computational metric and center the scale at equilibrium, we applied two adjustments: (i) recenter at 5 (equilibrium midpoint) and (ii) flip the sign so that heavier becomes more negative. Empirically, computed VW values range approximately from −50 to +50, so we also scaled the survey scores to this range. For a rating-scale score (y), the mapping is:

\text{Mapped equilibrium} = (y - 5) \times (-50)

This linear, monotone transformation yields a survey-based equilibrium metric directly comparable to the objective VW, without altering the rank ordering of façades or changing which samples are perceived as balanced, heavy, or light; in particular, it does not affect the Spearman correlations and serves solely to place the survey scores on a comparable axis for visual comparison with the VW calculations. As a simple sensitivity check, replacing the multiplier −50 with −1 produces the same Spearman correlation (ρ ≈ 0.85, n = 5) and identical façade ranking, confirming that this factor only rescales the vertical axis.
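The two linear mappings in this subsection, and the multiplier sensitivity check, can be illustrated with a short Python sketch. The function names are hypothetical, and the five mean ratings are invented placeholders rather than the survey results; the point is only to show that the choice of multiplier leaves the façade ordering untouched.

```python
def map_complexity(x):
    """Map a 0-10 complexity rating onto the FD range [1, 2]."""
    return 0.1 * x + 1.0


def map_equilibrium(y, multiplier=-50.0):
    """Recenter a 0-10 weight rating at 5 and flip the sign (heavier = negative)."""
    return (y - 5.0) * multiplier


def ranking(values):
    """Indices of the facades ordered from lowest to highest value."""
    return sorted(range(len(values)), key=lambda i: values[i])


# Hypothetical mean ratings for five facades (not the survey results).
ratings = [5.1, 6.4, 7.2, 6.0, 3.8]

scaled_50 = [map_equilibrium(y) for y in ratings]
scaled_1 = [map_equilibrium(y, multiplier=-1.0) for y in ratings]

# Both multipliers order the facades identically, and that order is the
# exact reverse of the raw-rating order because of the sign flip.
print(ranking(scaled_50) == ranking(scaled_1))
print(ranking(scaled_50) == ranking(ratings)[::-1])
```

Because any rank-based statistic depends only on this ordering, the multiplier cannot change a Spearman coefficient; it matters only for where the survey curve sits on the plotted axis.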

4.6.2. Visual-Complexity Survey Results

Visual-complexity survey outcomes are summarized in Table 10 and correlated with the computed fractal-dimension values in Table 11. FD3 shows a weak-to-moderate positive association with perceived complexity (r ≈ 0.43, n = 5), which we treat as an exploratory pattern within this small sample. FD1 and FD2 correlations with perceived complexity are close to zero, indicating little evidence of alignment at the coarser scale bands in this sample. The relationships between the mapped survey scores and FD1/FD2/FD3 are shown in Figure 12. The highest association is observed for FD3, as highlighted in Figure 13, indicating that the finest scale band (FD3) aligns most closely with participants’ perceived complexity.

When interpreting Figure 13, FD3 exhibits the strongest correspondence with subjective impressions. Nonetheless, Samples 2 and 5 show notable divergences. In Sample 2, arches lower the calculated FD at coarse scales, yet respondents judged the façade as visually complex—likely due to repetition of arches and perceived construction intricacy. In Sample 5, the façade’s ornamentation yields a higher FD, but many respondents perceived the overall composition as simple, possibly because its rectilinear order and traditional layout cue simplicity despite local detail.

Key observations:

Visual complexity: perception vs. measurement: The discrepancies between survey responses and calculated fractal dimensions (FD) suggest that respondents may interpret “complexity” differently from how FD operationalizes it (edge density across scales). To reduce ambiguity, provide a brief primer with one visual example (low-detail planar façade vs. fine-grained patterned façade) and a single comprehension check before the survey. This clarifies that FD reflects multi-scale edge structure rather than ornament meaning or style.
Factors influencing survey responses: The mention of confusion or lack of visual capability among respondents underscores the multifaceted nature of factors influencing perceptions. Age, cultural background, and expertise in art or architecture are all potential variables shaping individuals’ evaluations of visual complexity. Additionally, environmental factors such as lighting conditions or presentation format could impact responses. For instance, viewing images on a computer screen versus in-person may lead to varying perceptions, highlighting the need for controlled experimental conditions in future studies.
Role of fractal dimension in architectural analysis: The weak-to-moderate positive association observed between FD3 and survey results suggests the potential utility of FD analysis as a tool for understanding human perception of complexity in architectural forms. This finding implies that certain architectural features or design principles may influence fractal dimension calculations and their relationship to human perception. Further research could explore these dynamics in greater detail to elucidate the underlying mechanisms driving perceptions of complexity.
Interpreting discrepancies in specific cases: The notable discrepancy observed in the perception of the Alfaris mosque highlights the intricate nature of architectural perception. Unique architectural features may confound traditional measures of complexity, necessitating nuanced interpretation. Qualitative methods such as interviews or focus groups could provide valuable insights into the specific aspects of design that influenced participants’ perceptions. Understanding these discrepancies is crucial for refining measurement techniques and enhancing the accuracy of architectural assessments.
Implications for design practice: Understanding how people perceive complexity in architecture is vital for designers seeking to create aesthetically pleasing architecture. The findings of this study have significant implications for design decisions aimed at balancing complexity and simplicity to meet users’ preferences and needs. Designers should consider not only objective measures of complexity but also subjective perceptions when creating architectural designs. Iterative testing and feedback could be valuable for refining designs based on user preferences and enhancing overall user experience.

4.6.3. Visual-Equilibrium Survey Results

Survey outcomes for visual equilibrium are summarized in Table 12, and their associations with the computed visual weight (VW) are reported in Table 13. The paired trends are plotted in Figure 14, showing a close correspondence between subjective (mapped survey) and objective (computed VW) measures. The high descriptive correspondence (Spearman’s ρ ≈ 0.85, n = 5) provides preliminary convergent evidence that Larrosa’s visual-weight estimates track expert judgments of balance for this sample of mosque façades. Given the convenience, expert sampling and the conceptual definitions provided to participants, this evidence should be interpreted as initial rather than definitive validation, and further work is needed with more diverse participant groups and tasks that minimize priming.

Numerically, the correlation between the mapped survey scale and computed VW is Spearman’s ρ ≈ 0.85 (n = 5), indicating high descriptive correspondence within this small sample. Given the limited n, this result should be interpreted as exploratory, not as definitive statistical validation. The correlation using the numerically coded multiple-choice responses is r = −0.791; this negative sign is expected because the multiple-choice coding and the VW sign convention are inverted (heavier → larger positive code in the survey, whereas VW treats heaviness as more negative). Taken together, these results provide preliminary convergent evidence for the Larrosa visual-equilibrium calculation in this specific context.

The visual-preference data were obtained from a convenience sample of architecture academics, professional architects, and final-year architecture students, recruited through professional networks. This expert composition and the provision of conceptual definitions for ‘visual complexity’ and ‘visual equilibrium’ (Table 9) may increase alignment between subjective ratings and the theoretical constructs underlying Larrosa’s model. As a result, the observed high correlations between computed visual weight and survey-based balance should be regarded as upper-bound, expert-sample estimates of convergent validity, not as definitive proof that Larrosa’s method captures balance perceptions in the general population. Future research should replicate the study with lay participants and alternative tasks (e.g., pairwise comparisons without explicit definitions) to test generalizability and to further probe the robustness of the model.

Interpreting Figure 14. The overall pattern supports the calculation method, with one notable divergence at Sample 4. Respondents perceived Sample 4 as heavier than the computed VW suggests. This is plausibly due to the façade’s slim vertical openings, which can visually signal heaviness to observers; in the computation, however, those openings contribute opposite-signed (lightening) terms that partially offset the dominant horizontal elements, yielding a value nearer to equilibrium. Aside from this case, the trajectories of the two curves are consistent. Overall, the correspondence between subjective evaluations and the calculated results is consistent with both the reliability of expert perception in assessing visual equilibrium and the usefulness of Larrosa’s visual-weight calculation method within this sample.

Key observations:

Internal survey consistency: The strong within-survey agreement (rating scale vs. multiple-choice; see Table 12 and Table 13) indicates that respondents applied the visual-weight concept consistently across question formats. This suggests that the survey effectively captured participants’ perceptions of visual weight, providing valuable insights into their assessments of the prominence of architectural elements within surveyed structures.
Objective–subjective correspondence: The high descriptive correspondence between computed VW and mapped survey results (Spearman’s ρ ≈ 0.85, n = 5) provides preliminary convergent evidence that Larrosa’s method can act as a quantitative proxy for perceived balance in this sample of façades. As a simple sensitivity check, we recomputed this association using the mapping (y − 5) × (−1) instead of (y − 5) × (−50); the Spearman correlation and the ranking of façades were identical, indicating that the scaling factor only rescales the plotted values and does not influence the rank-based correspondence. In line with this, we treat the mapped survey scores as a visualization aid that places subjective and computed indices on comparable axes, rather than as an additional source of statistical evidence.
Case-level nuance (Al Babtain vs. Awaidhah): In Al Babtain, perceived equilibrium closely matches the computation (near zero), illustrating strong convergence. In Awaidhah, the perceived heaviness exceeds the computed value—likely reflecting how observers weigh continuous horizontal bands and narrow vertical perforations.
Perceptual emphasis differs from complexity: Several respondents appeared more certain about either complexity or weight for a given façade. This suggests that complexity (detail across scales) and equilibrium (up–down balance) tap distinct judgments. Designers and researchers should treat them as complementary, not interchangeable, constructs.
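The scaling check described under the second observation can be reproduced in a few lines. The sketch below is illustrative only and is not the study's analysis script: it takes the rounded survey scale averages printed in Table 12, applies the mapping with both factors, and compares the resulting orderings of the façades. The dictionary, the function names, and the shortened façade labels are assumptions introduced for the example.

```python
# Rounded survey scale averages, copied from Table 12.
survey_mean = {
    "Al Babtain": 4.763,
    "Alfaris": 5.018,
    "Al-Mohainy": 5.439,
    "Awaidhah": 5.816,
    "King Abdullah": 4.044,
}

def mapped(score, factor):
    """Linear mapping (y - 5) * factor used to place survey scores on the VW axis."""
    return (score - 5) * factor

def ordering(factor):
    """Façade names sorted from lowest to highest mapped score."""
    return sorted(survey_mean, key=lambda name: mapped(survey_mean[name], factor))

order_50 = ordering(-50)  # mapping used for the plotted curves
order_1 = ordering(-1)    # alternative mapping from the sensitivity check

print(order_50)
print(order_50 == order_1)
```

Because both factors are negative constants, the two mappings differ only by a positive multiple, so any rank-based statistic computed from them is necessarily the same.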

In conclusion, the strong agreement between computed visual weight and surveyed visual equilibrium, with a single, interpretable outlier, suggests that computed VW may serve as a useful quantitative proxy for perceived balance in this specific context. Incorporating confidence intervals, multi-operator partitioning and inter-rater agreement tests, part-level sensitivity checks, and case notes (as above) will further strengthen the evidential base for the method’s validity.

5. Conclusions

This study triangulated fractal dimension (FD), visual equilibrium (Larrosa’s visual weight, VW), and a visual-preference survey to characterize the visual qualities of a small set of Riyadh mosque façades. Across the five cases, FD values for the third (finest) iteration (FD3) fell between 1.203 and 1.547, indicating generally low visual complexity within this sample. Most samples lie near the 1.3–1.5 band often associated in the literature with calming, “stress-releasing” patterns, which is consistent with the restrained, Najdi-influenced vocabulary of these large main-road mosques.

Within this limited set of façades, the high descriptive alignment between computed VW and survey-mapped equilibrium (Spearman’s ρ ≈ 0.85, n = 5) provides preliminary convergent-validity evidence that Larrosa’s method can act as a quantitative proxy for perceived balance in this context. Minarets were typically near-balanced elements; paired, separate minarets tended to help maintain whole-façade equilibrium, especially when counterweighted by the broader prayer-hall mass. In the Al Babtain example, equilibrium was close to zero with slight lightness, consistent with its canonical Najdi proportions as perceived by the expert sample.

For visual complexity, correlations with subjective judgments were weak to moderate overall, but highest for FD3 (≈0.43; Table 11), suggesting that finer-scale detail (openings, frames) is most salient to observers. Divergences in specific cases (e.g., arches perceived as complex despite moderate FD; ornamented yet “simple” façades due to strong global order) indicate that repetition and overall organization modulate perceived complexity beyond what FD at a given scale captures.

A tentative rule of thumb, for façades of comparable style and composition, is that targeting FD ≈ 1.3–1.5 with VW ≈ 0 may yield visually calm yet balanced compositions; moving toward FD ≈ 1.5–1.6 may increase perceived liveliness provided that equilibrium is maintained.

These conclusions should be viewed in light of two main limitations: (i) the small, purposive set of five large main-road mosques in a single city, and (ii) the expert convenience sample (architects, academics, and advanced students), which likely provides upper-bound alignment between computational metrics and perception. As a result, patterns such as lower visual complexity and the near-equilibrium balance observed in some twin-minaret façades are best interpreted as within-sample tendencies and testable hypotheses, not universal properties of mosque architecture or of twin-minaret designs in Riyadh.

Future work should (i) where sample sizes allow, report confidence intervals with all effect sizes, (ii) explore scale-matched equilibrium (partitioning at finer part sizes to test FD1/FD2/FD3 against corresponding VW), (iii) add feature-specific FD (e.g., windows-only) and simple proxies for repetition and global order, and (iv) broaden to neighborhood mosques, historic fabric, and other Saudi and non-Saudi cities, including lay participants, to test cross-typology and cross-cultural generalizability.

Because Larrosa-based equilibrium values in this study depend on a single documented partitioning protocol implemented by one primary coder, the VW findings should be regarded as preliminary and method-development oriented rather than as fully validated across alternative segmentation choices. A priority for future work is to repeat the Larrosa calculations with multiple independent partitioners, quantify inter-operator agreement, and test the sensitivity of VW outcomes to alternative, yet reasonable, façade segmentation strategies. In addition, subsequent studies should test whether the observed FD–preference and VW–survey correspondences remain stable under alternative (including non-linear) mapping schemes for the rating scales to confirm that these patterns are not artifacts of rescaling. Short, guided primers in surveys may also help reduce ambiguity around “visual complexity” and “visual balance.”

This triangulated framework is designed to be adaptable to other building types and urban contexts. In all such applications, FD and VW are intended to serve as transparent, discussable indicators that support design and review conversations, while leaving final judgments about visual quality with human decision-makers.

Supplementary Materials

The following supporting information is available online: survey form, https://forms.office.com/r/uXXDNvDJP4 (accessed on 4 May 2024).

Author Contributions

Each author has made substantial contributions to the conception and design of the work. M.A.A. and E.M.M.S. jointly contributed to the ideas and their development, the design of the methodology, and the preparation of the original draft. E.M.M.S. performed the data analysis and interpretation. M.A.A. and E.M.M.S. modified the detailed descriptive and deductive approaches. M.A.A. contributed to the data resources. M.A.A. revised the manuscript. M.A.A. and E.M.M.S. handled data curation, writing, and editing. M.A.A. was responsible for funding acquisition and project administration. E.M.M.S. handled the software. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported and funded by the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU) (grant number IMSIU-DDRSP2602).

Data Availability Statement

The raw data supporting the conclusions of this article will be made available by the authors on request.

Conflicts of Interest

The authors declare no conflicts of interest.


Figure 1. Box-counting approach applied to one of the study mosque façades using three grid sizes: (a) coarse grid (FD1), (b) intermediate grid (FD2), and (c) fine grid (FD3). Occupied boxes (containing façade edges) are counted at each scale to derive fractal-dimension values for the three scale bands.

Figure 2. Visual effect of column proportions on equilibrium (author’s illustration after Larrosa’s visual-force concept).

Figure 5. Al Jawhara Al Babtain mosque façade geometric analysis for visual weight calculations.

Figure 6. Abdulaziz Alfaris Mosque façade geometric analysis for visual weight calculations.

Figure 7. Abdullah Nasser Al-Mohainy mosque façade geometric analysis for visual weight calculations.

Figure 8. Awaidhah mosque façade geometric analysis for visual weight calculations.

Figure 9. King Abdullah bin Abdul Aziz mosque façade geometric analysis for visual weight calculations.

Figure 10. Façade-level fractal-dimension estimates for the five mosques at the three scale bands: FD1 (coarse), FD2 (intermediate), and FD3 (fine). Values are deterministic façade-level indices derived from box counting and are therefore shown as point estimates without confidence intervals.

Figure 11. Computed visual-weight (VW) scores for the five mosque façades based on Larrosa’s method. As with FD, these are deterministic façade-level indices obtained from a single documented partitioning scheme, and are therefore presented as point estimates without sampling-based confidence intervals.

Figure 12. Visual complexity survey and fractal dimension curves.

Figure 13. Visual complexity mapped survey and fractal dimension FD3 result curves.

Figure 14. Visual weight mapped survey and visual weight calculations curves.

Table 1. List of mosque buildings (author).

Image	Digital Representation 3D	Elevation
Mosque name: Al Jawhara Al Babtain Mosque. Location link: https://maps.app.goo.gl/PexBLUief55P1msH8 (accessed on 4 August 2025).

Mosque name: Abdulaziz Alfaris Mosque. Location link: https://maps.app.goo.gl/7RYEAqC4gnxESbqg7 (accessed on 4 August 2025).

Mosque name: Abdullah Nasser Al-Mohainy Mosque. Location link: https://maps.app.goo.gl/rLtkws3G43fAZ4WGA (accessed on 4 August 2025).

Mosque name: Awaidhah Mosque. Location link: https://maps.app.goo.gl/a9jxKQEaoo2Kjp7H6 (accessed on 4 August 2025).

Mosque name: King Abdullah Bin Abdul Aziz Mosque. Location link: https://maps.app.goo.gl/umEAfJPGmVCyPtBG8 (accessed on 4 August 2025).

Table 2. Fractal dimension calculations.

1. Al Jawhara Al Babtain mosque
Iteration	FD1	FD2	FD3
Grid size 1	7200 mm	1800 mm	450 mm
Grid size 2	3600 mm	900 mm	225 mm
Box count N1 (grid size 1)	23	159	1077
Box count N2 (grid size 2)	62	433	2792
FD	1.431	1.445	1.374
2. Abdulaziz Alfaris mosque
Iteration	FD1	FD2	FD3
Grid size 1	7200 mm	1800 mm	450 mm
Grid size 2	3600 mm	900 mm	225 mm
Box count N1 (grid size 1)	30	263	2067
Box count N2 (grid size 2)	92	828	4759
FD	1.617	1.655	1.203
3. Abdullah Nasser Al-Mohainy mosque
Iteration	FD1	FD2	FD3
Grid size 1	7200 mm	1800 mm	450 mm
Grid size 2	3600 mm	900 mm	225 mm
Box count N1 (grid size 1)	20	148	1166
Box count N2 (grid size 2)	48	413	3406
FD	1.263	1.481	1.547
4. Awaidhah mosque
Iteration	FD1	FD2	FD3
Grid size 1	7200 mm	1800 mm	450 mm
Grid size 2	3600 mm	900 mm	225 mm
Box count N1 (grid size 1)	18	165	1036
Box count N2 (grid size 2)	58	457	2986
FD	1.688	1.470	1.527
5. King Abdullah Bin Abdul Aziz mosque
Iteration	FD1	FD2	FD3
Grid size 1	7200 mm	1800 mm	450 mm
Grid size 2	3600 mm	900 mm	225 mm
Box count N1 (grid size 1)	18	170	1178
Box count N2 (grid size 2)	67	473	3285
FD	1.896	1.476	1.480
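The FD values in Table 2 are numerically consistent with the standard two-grid box-counting estimate, FD = log(N2/N1) / log(s1/s2), where s1 and s2 are the coarser and finer grid sizes and N1 and N2 the corresponding occupied-box counts. The following minimal sketch assumes that formula (it is inferred from the tabulated numbers, not quoted from Section 3.3) and recomputes the three values for the first façade. The function name `box_counting_fd` is hypothetical.

```python
import math

def box_counting_fd(n_coarse, n_fine, size_coarse_mm, size_fine_mm):
    """Two-grid box-counting estimate: slope of log(count) against log(1/box size)."""
    return math.log(n_fine / n_coarse) / math.log(size_coarse_mm / size_fine_mm)

# (coarse grid, fine grid, N1, N2) for Al Jawhara Al Babtain mosque, copied from Table 2.
babtain = {
    "FD1": (7200, 3600, 23, 62),
    "FD2": (1800, 900, 159, 433),
    "FD3": (450, 225, 1077, 2792),
}

fd = {
    name: round(box_counting_fd(n1, n2, s1, s2), 3)
    for name, (s1, s2, n1, n2) in babtain.items()
}
print(fd)  # compare with the FD row of Table 2: 1.431, 1.445, 1.374
```

Since every iteration halves the grid size, the denominator is always log 2, so each FD is simply the base-2 logarithm of the ratio of the two box counts.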

Table 3. Al Jawhara Al Babtain mosque visual weight calculations.

Building Name	a (Height)	b (Width)	Quadratic Proportions	Ascending Force	Descending Force	Visual Weight
Al Jawhara Al Babtain mosque	23.8	3	2.799	8.397	8.503	−0.107
	23.8	3	2.799	8.397	8.503	−0.107
	9	6.3	1.310	8.252	6.871	1.381
	9	2	2.306	4.613	3.902	0.711
	9	7.3	1.182	8.627	7.615	1.012
	9	6.4	1.296	8.295	6.944	1.351
	9	2	2.306	4.613	3.902	0.711
	9	7.4	1.170	8.658	7.692	0.966
	2.4	2	1.158	2.317	2.072	−0.245
	5	2	1.796	3.592	2.784	−0.808
	7	2	2.088	4.176	3.352	−0.824
	36	2.8	3.218	9.011	11.186	2.175
	3	2	1.352	2.704	2.219	−0.486
	3	2	1.352	2.704	2.219	−0.486
	9	3.8	1.749	6.646	5.146	−1.500
	Total visual weight					3.745

Note: a, b in m; forces and VW dimensionless, computed as explained in Section 3.4.
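For readers who wish to check individual rows, the printed columns of Tables 3 to 7 are numerically consistent with the relationships QP = 1 + 2·log10(a/b), ascending force = |QP · b|, and descending force = |a / QP|, with the magnitude of each part's visual weight equal to the difference between the two forces. These relationships are inferred here from the tabulated numbers and should be treated as an assumption, not as a restatement of Section 3.4 or of Larrosa's formulation. The sign assigned to each part's visual weight follows the rules described in Section 3.4 and is not reproduced in this sketch; the function name `part_forces` is hypothetical.

```python
import math

def part_forces(a, b):
    """Return (quadratic proportion, ascending force, descending force) for one part.

    Assumed relationships, inferred from the printed table columns:
      QP = 1 + 2*log10(a/b), ascending = |QP*b|, descending = |a/QP|.
    Undefined when QP is exactly 0, i.e. when a/b = 10**-0.5.
    """
    qp = 1 + 2 * math.log10(a / b)
    return qp, abs(qp * b), abs(a / qp)

# A tall, narrow part (Table 3, first row) and a wide, low part (Table 6, fourth row).
for a, b in [(23.8, 3.0), (7.4, 44.9)]:
    qp, asc, desc = part_forces(a, b)
    print(f"a={a}, b={b}: QP={qp:.3f}, ascending={asc:.3f}, "
          f"descending={desc:.3f}, |difference|={abs(asc - desc):.3f}")
```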

Table 4. Abdulaziz Alfaris mosque visual weight calculations.

Building Name	a (Height)	b (Width)	Quadratic Proportions	Ascending Force	Descending Force	Visual Weight
Abdulaziz Alfaris mosque	24.2	2	3.166	6.331	7.645	−1.314
	24.2	2	3.166	6.331	7.645	−1.314
	37.1	58.1	0.610	35.464	60.780	−25.316
	50.4	4.2	3.158	13.265	15.958	−2.693
	50.4	56	0.908	50.875	55.477	4.602
	22.9	0.4	4.516	1.806	5.071	3.265
	22.9	0.4	4.516	1.806	5.071	3.265
	Total visual weight					−19.504

Table 5. Abdullah Nasser Al-Mohainy mosque visual weight calculations.

Building Name	a (Height)	b (Width)	Quadratic Proportions	Ascending Force	Descending Force	Visual Weight
Abdullah Nasser Al-Mohainy mosque	16.9	3.2	2.445	7.826	6.911	0.915
	16.9	3.2	2.445	7.826	6.911	0.915
	10	2	2.398	4.796	4.170	0.626
	10	0.4	3.796	1.518	2.634	−1.116
	9	5.5	1.428	7.853	6.304	1.549
	9	20.8	0.272	5.665	33.045	−27.380
	9	5.3	1.460	7.738	6.165	1.573
	35	17.5	1.602	28.036	21.847	−6.189
	4	1.8	1.694	3.048	2.362	−0.687
	4	1.8	1.694	3.048	2.362	−0.687
	Total visual weight					−30.481

Table 6. Awaidhah mosque visual weight calculations.

Building Name	a (Height)	b (Width)	Quadratic Proportions	Ascending Force	Descending Force	Visual Weight
Awaidhah mosque	21	2	3.042	6.085	6.902	−0.818
	5	2.3	1.674	3.851	2.986	0.865
	9	1.5	2.556	3.834	3.521	0.314
	7.4	44.9	−0.566	25.415	13.074	12.341
	5.8	8	0.721	5.765	8.048	−2.283
	1.4	0.6	1.736	1.042	0.806	−0.235
	1	0.7	1.310	0.917	0.763	−0.153
	19.7	0.5	4.191	2.095	4.701	2.605
	30	1.62	3.535	5.727	8.486	2.759
	5.6	13	0.268	3.490	20.857	17.367
	1.5	9.5	−0.603	5.731	2.486	−3.245
	0.5	19.5	−2.182	42.552	0.229	−42.322
	Total visual weight					−12.804

Table 7. King Abdullah Bin Abdul Aziz mosque visual weight calculations.

Building Name	a (Height)	b (Width)	Quadratic Proportions	Ascending Force	Descending Force	Visual Weight
King Abdullah Bin Abdul Aziz mosque	36.7	3	3.175	9.525	11.559	−2.033
	8	40.2	−0.402	16.171	19.887	−3.716
	4	39.2	−0.982	38.512	4.071	34.441
	3	2	1.352	2.704	2.219	−0.486
	8.9	1.6	2.491	3.985	3.574	−0.411
	7.9	14	0.503	7.042	15.706	8.664
	Total visual weight					36.458

Table 8. Comparing visual weight and fractal dimensions results.

Results	Al Babtain	Alfaris	Al-Mohainy	Awaidhah	King Abdullah
FD1	1.431	1.617	1.263	1.688	1.896
FD2	1.445	1.655	1.481	1.470	1.476
FD3	1.374	1.203	1.547	1.527	1.480
Visual Weight	3.745	−19.504	−30.481	−12.804	36.458

Table 9. Visual-preference multiple-choice options and scoring.

Visual Complexity Choice	Visual Equilibrium Choice	Numerical Score
Too simple, almost no details	Visually very light	1
Simple details	Visually light	2
Balanced between simplicity and complexity	Visual equilibrium	3
High complexity	Visually heavy	4
Too high complexity	Visually too heavy	5

Table 10. Visual complexity survey results.

	Building Name
	Al Jawhara Al Babtain Mosque	Abdulaziz Alfaris mosque	Abdullah Nasser Al-Mohainy mosque	Awaidhah mosque	King Abdullah Bin Abdul Aziz mosque
Survey Average (Scale evaluation)	3.570	4.184	4.886	5.772	3.430
Scale evaluation mapped to FD, (x × 0.1) + 1	1.357	1.418	1.489	1.577	1.343
Multi-Choice evaluation options.	Multi-Choice evaluation count
Too simple	10	15	10	8	36
Simple details	72	38	22	21	56
Balanced	35	55	63	42	21
High complexity	1	1	1	1	1
Too high complexity	1	1	1	8	1
Choice average converted to scale out of 5	2.351	2.325	2.211	1.930	1.930
Survey correlation between the two questions averages	−0.327

Table 11. Correlation between fractal dimension and visual complexity survey results.

Fractal Dimension Scale	Correlation Coefficient	Correlation Interpretation
FD1	−0.208	FD3 has by far the highest correlation coefficient, indicating that the third (finest) iteration corresponds most closely to the survey results.
FD2	−0.046
FD3	0.430

Table 12. Visual equilibrium survey results.

	Building Name
	Al Jawhara Al Babtain mosque	Abdulaziz Alfaris mosque	Abdullah Nasser Al-Mohainy mosque	Awaidhah mosque	King Abdullah Bin Abdul Aziz mosque
Survey scale average	4.763	5.018	5.439	5.816	4.044
Mapped scale results, (x − 5) × (−50)	11.842	−0.877	−21.930	−40.789	47.807
Multichoice options	Multichoice questions count
Visually very light	8	12	4	6	35
Visually light	39	35	20	18	42
Visual equilibrium	61	48	54	35	28
Visually heavy	10	23	35	43	8
Visually too heavy	1	1	6	17	6
Choice average converted to scale out of 5	2.754	2.833	3.298	3.544	2.325
Survey correlation between the two questions averages	0.988

Table 13. Visual equilibrium survey correlation with visual weight calculations.

Results	Correlation Coefficient	Correlation Interpretation
Mapped survey scale results	0.845	High descriptive correspondence between mapped survey scale and computed VW in this five-façade sample.
Numerical Multichoice results	−0.791	High descriptive correspondence in the opposite sign (due to inverted coding), consistent with the internal agreement between the two survey formats
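The coefficients in Table 13 can be re-derived from the values printed in Tables 8 and 12. The sketch below is a minimal, standard-library illustration and not the authors' analysis script: `pearson` and `ranks` are hypothetical helper names, and the inputs are the rounded table values. On these inputs the product-moment (Pearson) coefficient reproduces the tabulated 0.845 and −0.791, whereas a rank-based coefficient computed on the same five pairs comes to 0.70, so readers comparing against the text may wish to check which coefficient is being quoted.

```python
import math

def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ranks(values):
    """1-based ranks; assumes no ties, which holds for these data."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result

# Order: Al Babtain, Alfaris, Al-Mohainy, Awaidhah, King Abdullah.
vw = [3.745, -19.504, -30.481, -12.804, 36.458]      # Table 8, computed VW
mapped = [11.842, -0.877, -21.930, -40.789, 47.807]  # Table 12, mapped scale results
choice = [2.754, 2.833, 3.298, 3.544, 2.325]         # Table 12, choice average

pearson_mapped = pearson(vw, mapped)
pearson_choice = pearson(vw, choice)
rank_mapped = pearson(ranks(vw), ranks(mapped))  # rank correlation without ties

print(f"Pearson, VW vs mapped scale:   {pearson_mapped:.3f}")
print(f"Pearson, VW vs choice average: {pearson_choice:.3f}")
print(f"Rank-based, VW vs mapped:      {rank_mapped:.3f}")
```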

