1. Introduction
Visual assessment is a critical tool in understanding architectural character and is essential for guiding design and preserving urban identity. Given its inherently subjective nature, researchers have developed quantitative techniques to improve objectivity, including parsing methods for façade pattern analysis [1], perceptual mathematical models [2], and fractal dimension (FD) calculations, first introduced to architectural analysis by Carl Bovill in 1996 [3]. These computational approaches variously address visual complexity, perceptual balance, statistical components, and image segmentation. Focusing on Riyadh’s mosque architecture, which is central to the city’s identity and cultural heritage, this study integrates objective metrics with subjective evaluation to ensure that computational results resonate with human visual experience and aesthetic judgment [4]. Riyadh has a distinct architectural character rooted in the Najdi region, whose vernacular mud-brick and stone architecture is characterized by compact courtyard layouts, thick walls, simple volumes, and distinctive triangular or geometric openings and crenellation [5,6]. Contemporary architecture in Riyadh has, to some extent, preserved and reinterpreted this identity in new mosque and civic projects, using traditional Najdi elements within modern materials and forms [7,8,9]. In this study, we apply computational visual assessment to a small set of contemporary mosque buildings in Riyadh, built within approximately the last two decades, as an exploratory case study of this recent phase. Throughout, human observers remain central: computational measures are interpreted only in relation to expert and advanced-student judgments and are proposed as aids to, not substitutes for, the evaluation of visual character.
1.1. Research Problem and Gap
Although fractal dimension has been widely used and calibrated as a measure of visual complexity, interpretation still requires contextual understanding. By contrast, visual equilibrium has only recently been operationalized by Larrosa’s perceptual visual-force model, and published applications provide only a small number of case studies and limited-sample comparisons to human judgments [2,10], in sharp contrast to the much larger set of FD studies that link complexity scores to perception [11,12,13,14,15]. Current design review and legislation demand methods that are both objective and practically interpretable; however, the relationship between visual complexity and visual equilibrium, and the extent to which it aligns with human preference, remains under-examined. A comparative, multi-method framework is therefore needed to clarify these relationships and to test the validity of perceptual models against established complexity metrics and subjective judgments.
1.2. Objectives
This exploratory pilot study examines five large mosque façades on main roads in Riyadh and triangulates three techniques: (i) quantitative fractal-dimension analysis for visual complexity, (ii) quantitative visual-force calculations for visual equilibrium (Larrosa), and (iii) qualitative/subjective visual preference. The aim of this research is not to classify mosque types in Riyadh or to map the full diversity of mosque styles, but rather to explore the application of a triangulated visual-assessment framework—combining fractal dimension (FD), Larrosa’s visual equilibrium (VW), and expert perception—using a small set of façades as a testbed for comparing the three methods. Specifically, we:
Quantify visual complexity and visual equilibrium for representative mosque façades in Riyadh;
Provide initial convergent evidence for Larrosa’s visual-force method by testing its correlation with subjective perceptions of visual balance in an expert sample of architects and advanced students;
Investigate the relationship between visual equilibrium and visual complexity by comparing Larrosa’s outputs with FD results;
Establish a comprehensive, multi-method system of visual inquiry tailored to Riyadh’s architectural context.
Working propositions. Because openings/voids reduce both visual weight and FD while projections/solids have the opposite effect, we anticipate systematic association between equilibrium and complexity at the façade scale. Throughout, FD and VW are treated not as stand-alone descriptors but as candidate quantitative summaries whose usefulness is evaluated against expert perceptual judgments.
1.3. Contributions
This paper offers four contributions:
Exploratory, sample-specific convergent-validity evidence for Larrosa’s visual-equilibrium model in this convenience sample of experts and advanced students, via correlations with subjective preference and FD, without claiming general validation beyond the present façades.
A comparative account of visual equilibrium (visual weight) and visual complexity (FD) for the same buildings.
An integrated, multi-method quantitative framework for assessing Riyadh’s mosque architecture.
An illustrative visual-assessment framework that can be adapted to other building types in future studies.
1.4. Scope and Limitations
The study does not measure beauty or psychological impact per se. Rather, it computes visual complexity and visual equilibrium and compares them with subjective impressions to understand Riyadh’s visual character. Religious buildings are used as stimuli because their salience supports reliable subjective judgments; “beauty” serves only as a sampling rationale for spiritually impactful cases.
The empirical component of this study is based on a deliberately small, purposive sample of five large mosques located on primary roads in Riyadh. These buildings were chosen because (i) their façades are highly visible in the everyday visual field of city users, (ii) their unobstructed main elevations enable reliable and comparable FD and visual-equilibrium calculations, (iii) they include minarets that create strong vertical–horizontal proportioning relevant to Larrosa’s approach, and (iv) CAD drawings and models were accessible for manual analysis. No formal city-wide sampling frame of “large, main-road mosques” was available at the time of the study; therefore, the five cases should be regarded as a convenience-based illustrative subset, not a statistically representative subsample of Riyadh’s mosque stock. Any statements about “Riyadh’s mosque architecture” in this paper should therefore be read as referring specifically to this subset of large, visually prominent main-road mosques. The findings provide within-sample evidence about the relationships between FD, visual equilibrium, and subjective preference, but do not estimate the full distribution of visual characteristics across the city’s mosque stock or the wider population of large main-road mosques; the five study sites are treated as an illustrative case series within this visually salient subtype. In addition, because brief construct explanations were provided immediately before the rating tasks, metric–survey correspondences should be read as preliminary, exploratory convergent evidence within this specific sample rather than as independent validation of Larrosa’s method in mosque architecture more broadly.
The remainder of this paper presents a brief literature review (Section 2), the methods for FD, visual equilibrium, and the survey (Section 3), the empirical results and triangulated analysis (Section 4), and conclusions and directions for future research (Section 5).
2. Literature Review
Religious buildings have historically shaped the visual and spiritual identity of cities and civilizations [16,17]. In Saudi Arabia, mosques are central to urban heritage, and in the Najd region, particularly Riyadh, distinctive compositional traits convey clarity, openness, and multifunctionality that link tradition with a forward-looking urban vision [9].
Although religious architecture is materially constructed, it often evokes spiritually inflected experiences [18]; this underlines the cognitive–perceptual dimensions of how observers engage with architectural form. Classic accounts of the sublime (Kant; Burke) describe awe and boundlessness when confronting works perceived as powerful or vast [19,20], a response frequently reported in religious settings [18]. This strong subjective salience motivates the use of visual preference as a benchmark when evaluating computational visual-assessment methods in religious architecture.
2.1. Visual Perception
Vision typically dominates multisensory appraisal of the built environment [21] and involves both reception and higher-order cognition [22]. Building on a Kantian reading of Gestalt theory, Arnheim emphasized that vision is not a mechanical recording but an active organization of sensory material according to principles such as simplicity, regularity, and balance [4,23]. Accordingly, architectural perception privileges structural patterns over isolated elements, which supports the development of computational measures aimed at capturing those higher-order configurations.
2.2. Computational Visual Assessment
To reduce subjective bias, researchers have developed computational approaches spanning (i) image parsing and processing, e.g., line segmentation, edge detection, component-based façade parsing, and Hough-transform-aided feature extraction [1,24,25,26]; (ii) visibility analyses, such as façade isovists [27]; and (iii) metric models that quantify particular constructs, notably fractal dimension (FD) for visual complexity and perceptual visual-force models for equilibrium [2,11]. These method families differ in what they measure (complexity, balance, visibility), how inputs are segmented, and how results are interpreted for design review and codes.
In parallel with these classical image-processing and metric-based approaches, recent years have seen rapid growth in computer-vision and deep-learning methods applied to façades and urban scenes. End-to-end convolutional neural networks now perform façade parsing and pixel-wise semantic segmentation of façade components (e.g., windows, doors, balconies, cornices), enabling automated extraction of architectural elements from street-level imagery for large datasets [28]. Building on such pipelines, subsequent work has focused on deep-learning models for detecting and quantifying façade elements and opening patterns, which supports tasks such as window-to-wall ratio estimation and façade-element statistics at the district or city scale [29].
Beyond individual buildings, CNN-based frameworks have been used to measure façade color distributions and functional classifications from street-view images at city scale, and to reconstruct 3D façade geometry for urban modeling and heritage documentation [30]. Synthetic reviews of computer-vision analysis of buildings and the built environment emphasize that these deep-learning approaches excel at large-scale automation and scene understanding yet often rely on substantial labeled datasets and operate as high-dimensional ‘black-box’ predictors whose internal representations are not easily interpretable by designers [31]. For design-review panels and code authorities, this opacity is a practical barrier: any quantitative index used in approvals or guidelines must be explainable, traceable to visible geometric relations on the façade, and expressible as simple ranges or thresholds that can be checked directly on drawings.
Within this broader landscape, the present study does not propose a new deep-learning architecture. Instead, it focuses on a complementary problem: testing whether interpretable, low-dimensional metrics—fractal-dimension complexity and Larrosa’s proportion-based visual equilibrium—align with experts’ and students’ subjective judgments for a coherent architectural typology (Riyadh mosques). While FD has already been used as an interpretable scalar index of visual complexity in numerous architectural and perceptual studies, Larrosa’s equilibrium formulation has so far appeared only in a small number of case studies and limited sample applications and has not yet been calibrated against expert judgments across building types. This imbalance helps explain why equilibrium remains under-explored relative to complexity, despite its close conceptual connection to design talk of balance and visual weight. By triangulating FD, equilibrium scores, and visual-preference data on the same façades, the study provides validation evidence for perceptual models that can be directly read and adjusted by designers and regulators. These transparent metrics are intended to complement, rather than replace, deep-learning pipelines by offering architecturally legible indicators that can inform design review and code development in contexts where large training datasets and high-end computational infrastructure may not be available.
2.3. Fractal Dimension as a Measure of Visual Complexity
Mandelbrot’s fractal concept formalized scale-dependent self-similarity in natural and man-made patterns [32,33,34]. This mathematical representation has enabled the creation of complex images such as the Mandelbrot set and Julia sets. In architecture, FD, commonly estimated via box-counting, has been interpreted as a proxy for visual complexity [12]. Early applications by Bovill quantified characteristic façade complexity (e.g., the Robie House) [3,17], while subsequent work refined scaling protocols and stressed careful image preparation (thresholding, segmentation, iteration depth) to improve reliability (e.g., Ostwald and Tucker) [12,17,35]. Studies increasingly combine manual and digital workflows (e.g., AutoCAD-assisted box counting; AutoCAD, version 2026, Autodesk, 2025) to operationalize FD across typologies [36]. The consensus is that FD is robust for relative comparisons of complexity, yet its perceptual meaning is context-dependent; FD magnitudes require interpretation against typology, composition, and viewing scale [11,12,13].
Figure 1 illustrates the box-counting procedure applied to one of the study mosque façades at three grid sizes, corresponding to the coarse (FD1), intermediate (FD2), and fine (FD3) scale bands used in this study.
2.4. Perceptual Visual Forces and Visual Equilibrium
Building on Arnheim’s visual dynamics, in which perception organizes form according to principles such as balance and the “upward thrust” of verticals, Larrosa formalized a computational framework that models visual equilibrium as the resultant of two opposing, proportion-driven forces: an ascending (principal) force associated with upward thrust and a descending (complementary) force associated with perceived visual weight [4]. Conceptually, a façade approaches equilibrium when the algebraic sum of these forces is near zero; large negative values indicate a heavy, ground-piercing tendency, while large positive values indicate a light, buoyant tendency [2,10]. Although operational and attractive for design review, Larrosa’s formulation has received comparatively less empirical validation against human judgments than fractal-dimension (FD) approaches, hence the value of studies that place equilibrium measures alongside FD and preference data [2,10]. Whereas FD has been examined in numerous perceptual studies that relate its values to preference and other visual responses [11,12,13,14,15], empirical tests of Larrosa’s equation remain limited to a handful of examples and small-scale evaluations, with no broad calibration across building types.
Conceptual example (Figure 2). Variations in column proportion and entablature depth illustrate how perceived balance shifts with geometric relations. In a didactic trio, a 1:9 column with an oversized entablature yields a net negative (heavier) outcome; a 1:8 proportion approximates equilibrium; and a 1:7 with a shallower entablature produces a net positive (lighter) outcome, aligning with Arnheim’s account of counter-tension between vertical and horizontal components [4].
2.5. Visual Preference as Validation Evidence
In environmental design research, visual preference is frequently employed to assess whether computed visual metrics show patterns that are consistent with perception, serving as an external perceptual check rather than a definitive criterion of truth for objective measures [11]. Prior work reports systematic relationships between complexity and preference; for example, Abboushi found peak visual interest at fractal dimension (FD) values of approximately 1.5–1.7 when testing architecture students [15], while Hussein showed a direct effect of manipulated visual complexity on preference ratings [14]. Methodologically, studies often use two-dimensional projections to control parameters and isolate visual variables, which improves internal validity and analytic precision [11]. Taken together, this literature supports using preference data as an independent perceptual benchmark when interpreting computational assessments (e.g., FD-based complexity and equilibrium scores). Preference ratings help relate subjective appraisal to objective quantification, while recognizing that any correspondence is partial, context-dependent, and does not constitute a general validation of the metrics [11,14,37].
2.6. Gap and Rationale
Across these strands, two linked gaps persist: (i) equilibrium metrics remain empirically under-validated relative to FD, and (ii) few studies jointly compare FD and equilibrium on the same façades while testing both against subjective preference. Addressing this gap motivates the present triangulated design that relates FD-based complexity, proportion-based equilibrium (Larrosa), and human judgments within a coherent typological sample (Riyadh mosques), as detailed in the Methods section.
3. Methodology
This study implements a triangulated, mixed-methods design to (i) quantify visual complexity via fractal dimension (FD), (ii) quantify visual equilibrium via Larrosa’s perceptual visual-force formulation, and (iii) obtain visual-preference judgments, then statistically examine their relationships on the same set of mosque façades in Riyadh. The workflow comprises sample selection and image acquisition, standardized elevation representation (Level 3), FD computation, equilibrium computation and aggregation, and an external validation survey, followed by correlation analyses among metrics and survey outcomes. For these façade-level associations (n = 5), we summarize correspondence using Spearman’s ρ as descriptive effect sizes; no multiple-testing corrections are applied, and survey predictors are façade-level mean ratings rather than individual-level covariates (Figure 2, Figure 3 and Figure 4; Table 1).
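As an illustration of the façade-level summary described above, the following minimal Python sketch computes Spearman’s ρ as the Pearson correlation of average ranks. It is a sketch under stated assumptions rather than the procedure used in this study: the function names are hypothetical, and the five metric values and five mean ratings are made-up placeholders, not study data.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, assigning tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Placeholder values for five façades (NOT study data).
metric_per_facade = [1.42, 1.55, 1.61, 1.48, 1.70]   # e.g., one FD band
mean_rating_per_facade = [3.1, 3.8, 3.6, 3.3, 4.2]   # façade-level mean ratings

rho = spearman_rho(metric_per_facade, mean_rating_per_facade)
print(round(rho, 3))
```

With only five façades, ρ can take only a small number of distinct values, which is one reason it is reported here as a descriptive effect size.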
3.1. Study Sites, Sampling Frame, and Inclusion Criteria
Mosques were sampled from primary roads in Riyadh to reflect the city’s everyday visual field. Inclusion criteria: (i) location on a main artery; (ii) comparatively large scale and clear street presence (often with dual minarets); (iii) architectural diversity spanning traditional and contemporary idioms; (iv) unobstructed visibility of the principal road-facing elevation. Exclusion criteria: substantial occlusions (e.g., trees, scaffolding) covering >20% of key elevation features; nighttime images; severe perspective obstruction. Final selections and map links are listed in Table 1. Because no comprehensive, type-coded inventory of Riyadh mosques was available, we did not construct a formal statistical sampling frame or estimate the total number of eligible large main-road mosques; instead, the five cases in Table 1 were purposively chosen to span traditional and more contemporary compositions within the inclusion criteria. In future work, we will extend the sampling frame to neighborhood-scale mosques across multiple Riyadh districts (including single-minaret and attached-minaret types and varied courtyard configurations) to test generalizability beyond major arterial sites.
3.2. Image Acquisition and Elevation Representation (Level 3)
Each mosque was photographed orthogonally to the main road façade under daylight with comparable illumination. Images were rectified (keystone/perspective correction as needed) and cropped to the principal elevation. Following Ostwald’s five-level representation scheme, we used Level 3 (overall detail beyond openings, short of material texture) to standardize drawings for analysis, given prior evidence that Levels 3 and 4 yield similar FD results while Level 3 reduces preprocessing variability [38]. The façade samples were re-generated as monochrome AutoCAD (version 2026, Autodesk Inc., San Francisco, CA, USA) elevation drawings; no image-based thresholding or other digital pre-processing was applied. Edges were manually traced in AutoCAD (including primary outlines, openings and cornices, while excluding signage, overhead wires, and temporary objects), and all lines were standardized to a line width of 0.20 mm before applying the six box-counting grid sizes.
3.3. Fractal Dimension (FD) Computation
Fractal dimension (FD) was used to operationalize visual complexity and was computed via box-counting on the Level-3 elevation drawings [11,13,39]. FD was calculated for (a) each mosque elevation and (b) the adjacent skyline window for contextual comparison [11,12,13,36,39]. All computations were performed manually in AutoCAD on vector façade drawings rather than on raster images.
Vector drawings were first cleaned to remove redundant lines and to ensure a uniform line thickness before grid overlay. A square grid was then superimposed at six box sizes (7200, 3600, 1800, 900, 450, and 225 mm). These six sizes were grouped into three two-scale segments corresponding to FD1, FD2, and FD3 (coarse, intermediate, and fine scales). For each segment, two box sizes (s1, s2) were used (e.g., 7200 mm and 3600 mm), and the number of occupied boxes at each size (Ns1, Ns2) was counted manually.
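The occupied-box count in this study was obtained manually in AutoCAD. Purely as an illustration of what is being counted, the following Python sketch approximates the same count for a vector drawing given as a list of line segments in millimetres. The function name, the segment format, and the point-sampling approach (with its `step` parameter) are assumptions introduced here, not part of the study’s procedure; sampling at a finite step can miss a box that a line only clips at a corner.

```python
import math

# The six grid sizes used in the study, in mm.
BOX_SIZES_MM = [7200, 3600, 1800, 900, 450, 225]

def count_occupied_boxes(segments, box_size, step=1.0):
    """Approximate the number of grid boxes touched by any line segment.

    segments: list of ((x0, y0), (x1, y1)) tuples in drawing units (mm).
    box_size: side length of the square grid cell (mm).
    step: sampling interval along each segment (mm); an assumed parameter.
    """
    occupied = set()
    for (x0, y0), (x1, y1) in segments:
        length = math.hypot(x1 - x0, y1 - y0)
        n = max(1, math.ceil(length / step))
        for i in range(n + 1):
            t = i / n
            x = x0 + t * (x1 - x0)
            y = y0 + t * (y1 - y0)
            # Identify the grid cell containing this sample point.
            occupied.add((math.floor(x / box_size), math.floor(y / box_size)))
    return len(occupied)

# Toy drawing: a single horizontal line 7199 mm long (not a study façade).
toy_segments = [((0.0, 100.0), (7199.0, 100.0))]
counts = {s: count_occupied_boxes(toy_segments, s) for s in BOX_SIZES_MM[:2]}
print(counts)
```

For the toy line, the count doubles when the box size halves, which is the behaviour expected of a one-dimensional object.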
FD for each segment was then computed from the two log–log points using base-10 logarithms as:
$$\mathrm{FD} = \frac{\log_{10} N_{s_2} - \log_{10} N_{s_1}}{\log_{10}(1/s_2) - \log_{10}(1/s_1)}$$
where s is the box size and N_s is the occupied-box count at that scale. Because each FD estimate is based on only two box sizes, the regression line through these two points is exact (R² = 1.000), and 95% confidence intervals for the slope cannot be estimated (zero degrees of freedom).
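The two-point calculation can be sketched in a few lines of Python. This is an illustrative sketch only: it uses the standard box-counting slope between the points (log 1/s1, log Ns1) and (log 1/s2, log Ns2), the function name is hypothetical, the pairing of the second and third bands as (1800, 900) and (450, 225) is assumed from the six-size grouping described above, and the occupied-box counts are made-up placeholders rather than study data.

```python
import math

def two_point_fd(s1, n1, s2, n2):
    """Slope of the line through (log10 1/s1, log10 n1) and (log10 1/s2, log10 n2)."""
    return (math.log10(n2) - math.log10(n1)) / (
        math.log10(1 / s2) - math.log10(1 / s1)
    )

# (box size in mm, occupied-box count) pairs per band.
# Counts are hypothetical placeholders, NOT study data.
bands = {
    "FD1": ((7200, 6), (3600, 17)),     # coarse
    "FD2": ((1800, 48), (900, 140)),    # intermediate (pairing assumed)
    "FD3": ((450, 410), (225, 1230)),   # fine (pairing assumed)
}

fd = {
    name: two_point_fd(s1, n1, s2, n2)
    for name, ((s1, n1), (s2, n2)) in bands.items()
}
for name, value in fd.items():
    print(name, round(value, 3))
```

Because the box size halves within each band, the slope reduces to log2(Ns2/Ns1): a count that doubles gives FD = 1 (a line) and a count that quadruples gives FD = 2 (a filled plane).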
Accordingly, throughout this paper FD1–FD3 are treated as fixed descriptive indices, conditional on the chosen two-scale grid and manual counting procedure; no standard errors, confidence intervals, or formal goodness-of-fit tests are associated with the FD slopes themselves. The three bands (FD1–FD3) were specified a priori on theoretical grounds, following Bovill’s box-counting scheme as refined by Ostwald, to approximate coarse, intermediate, and fine architectural scales; FD3, based on the smallest box sizes, was expected to be most sensitive to self-similar façade detail. None of the bands were adjusted in response to observed correlations or survey results, and all are retained as descriptive indices to allow multi-scale comparison.
3.4. Visual-Equilibrium Computation (Larrosa)
Larrosa’s perceptual force model estimates a façade’s equilibrium as the algebraic combination of an ascending (upward-thrust) component and a descending (visual-weight) component derived from shape proportions. To operationalize this on mosque elevations, we (i) partitioned each Level-3 elevation into geometrically coherent rectangular parts that reflect visible relations and compositional unity, (ii) measured each part’s vertical extent (a) and horizontal extent (b), and (iii) aggregated signed force contributions into a net visual weight, VW.
Canonical formulation (Figure 3). For a rectangular part with vertical side a and horizontal side b, Larrosa defines a quadratic proportion P and two forces, ascending Fp and descending Fc, typically expressed as [2]:
In this study, all dimensions “a” and “b” are measured in meters on scale-correct CAD elevations, and the logarithm in the proportion formula uses base-10. The derived quantities P, |Fp|, |Fc| and VW are treated as dimensionless indices. Component-level values reported are rounded to three decimal places, and the ‘Total visual weight’ in each table is obtained by algebraically summing the component VW values in the last column.
Figure 3. Force vectors and symbols for a rectangular part (vertical side a, horizontal side b) used in Larrosa’s formulation. Notes for reporting: (i) state the logarithm base; (ii) define the measurement unit and image scale; (iii) clarify whether b denotes the horizontal extent and a the vertical extent across all parts; (iv) specify rounding/precision.
Aggregation over a façade (Figure 4). To estimate the façade’s resultant visual weight:
1. Partition the elevation into geometrically unified parts (rectangles) that reflect visible relations and compositional logic.
2. Measure a and b for each part and compute P, Fp, and Fc.
3. Apply sign convention: ascending components are summed as positive; descending as negative.
4. Handle voids/openings: treat openings as opposite-signed contributions (subtract their ascending component; add their descending component) to reflect their lightening effect.
5. Sum across all parts to obtain the net value VW. Interpret as:
○ VW > 0: visually light, upward tendency;
○ VW ≈ 0: visual equilibrium;
○ VW < 0: visually heavy, ground-piercing tendency.
Figure 4.
Illustration of visual weight calculation method, drawn by the authors. Workflow for façade partitioning and force aggregation; sign convention and treatment of openings.
In this study, each Level-3 elevation was partitioned into rectangular parts according to a simple protocol: visually coherent architectural components (e.g., prayer-hall block, portico, podium, minarets, major cornices) were treated as single parts; openings and recesses were treated as voids with opposite-signed contributions; and homogeneous regions with similar proportions and function were kept grouped rather than arbitrarily subdivided. Segmentation was performed by the first author and reviewed by the second author to ensure consistent application of these rules across all façades. We did not, however, perform formal inter-operator agreement testing (e.g., blind repeat partitioning or ICC/Kappa statistics), so the VW estimates reported here should be interpreted as conditional on this specific operationalization of Larrosa’s method and as pilot-level, hypothesis-generating values pending multi-rater validation.
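The sign convention and aggregation rule described in this section can be illustrated with a short Python sketch. It deliberately takes the ascending and descending force magnitudes as precomputed inputs, because Larrosa’s expressions for them are given in [2] and are not re-derived here. The class and function names, the example magnitudes, and the equilibrium tolerance `tol` are all assumptions introduced for illustration; the study itself does not specify a numerical tolerance for “near zero”.

```python
from dataclasses import dataclass

@dataclass
class Part:
    name: str
    ascending: float       # |Fp|, precomputed from Larrosa's formula in [2]
    descending: float      # |Fc|, precomputed from Larrosa's formula in [2]
    is_void: bool = False  # openings/recesses contribute with opposite sign

def part_vw(part: Part) -> float:
    """Signed contribution: +|Fp| - |Fc| for solids, -|Fp| + |Fc| for voids."""
    signed = part.ascending - part.descending
    return -signed if part.is_void else signed

def total_vw(parts, ndigits=3):
    """Sum component values after rounding each to three decimals, as in the tables."""
    return sum(round(part_vw(p), ndigits) for p in parts)

def interpret(vw: float, tol: float = 0.05) -> str:
    """Classify the net value; tol is an assumed, illustrative threshold."""
    if vw > tol:
        return "light"
    if vw < -tol:
        return "heavy"
    return "equilibrium"

# Hypothetical magnitudes for three parts (NOT study data).
parts = [
    Part("prayer-hall block", ascending=0.4, descending=0.9),
    Part("minaret", ascending=1.2, descending=0.3),
    Part("main doorway", ascending=0.1, descending=0.2, is_void=True),
]
vw = total_vw(parts)
print(round(vw, 3), interpret(vw))
```

In this toy case the wide hall block contributes a negative (heavy) value, the minaret a positive (light) value, and the doorway void flips the sign of its own contribution, so the façade as a whole comes out on the light side.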
3.5. Visual-Preference Survey (External Validation)
The validation process involved a survey designed to compare calculated results to subjective impressions, testing the reliability of the quantitative data against human perception. The survey was created by the authors to directly inquire about participants’ visual impressions of the sample buildings. It was distributed both through an online link and directly shared with architects, architecture professors, and final-year architecture students via professional groups and personal contacts. The purpose and intent of the questions were clearly explained, with specific emphasis on accuracy for student participants. The survey thus acts as a perceptual anchor, providing an exploratory check on how fractal-dimension and visual-weight computations align with expert impressions within this sample. In this way, the human factor is built into the methodology rather than treated as an afterthought.
3.5.1. Survey Objectives
The survey was designed to achieve the following objectives:
To determine whether participants’ subjective impressions align with the calculated visual complexity (fractal dimension).
To examine whether computed visual complexity and equilibrium show convergent patterns with expert participants’ judgments of the same façades, as an external check on the computational methods.
To establish the correlation between subjective and objective visual assessment approaches, thereby testing the reliability of the computational models.
3.5.2. Survey Structure
The survey consisted of three main sections:
Demographic Information—including participant country, gender, and professional background.
Visual Complexity—questions designed to assess subjective impressions of each sample’s visual complexity.
Visual Equilibrium—questions focused on subjective perceptions of visual balance for each sample.
3.5.3. Survey Design
The questions were structured as a direct inquiry into the subjective assessment of visual complexity and visual equilibrium for each of the mosque samples. To ensure clarity and improve validity, two measurement scales were used:
Multiple-choice questions, offering descriptive options.
Rating scales, allowing participants to score attributes on a numerical scale.
The goal was to validate the computed results for visual complexity and equilibrium by comparing them with user perception. Therefore, questions were limited to these two variables, intentionally avoiding broader subjective themes that could skew the focus or introduce bias. The survey was designed as a validation tool for the computational visual assessment methods.
3.5.4. Survey Implementation
The survey was conducted in Saudi Arabia and Egypt to ensure a culturally relevant yet slightly varied assessment of the mosque architecture. It was administered and monitored by the authors over a two-month period (April–May 2024) and collected 114 valid responses from three groups:
21 architecture academics
47 professional architects
46 final-year architecture students
The data showed no outliers, and it was noted that participants had a clearer understanding of visual equilibrium than visual complexity, likely due to the more intuitive nature of balance compared to intricacy.
Survey Participant Selection Criteria
Participants were selected according to the following criteria:
Relevant Academic or Professional Background: Individuals with experience or education in architecture, visual design, or related disciplines, ensuring a foundational understanding of visual assessment.
Diverse Expertise Levels: A mix of academics, practitioners, and advanced students to represent a broad but relevant range of perspectives.
Cultural Context: Participants from Saudi Arabia and Egypt ensured familiarity with mosque architecture and allowed exploration of cross-cultural perceptual consistency.
3.5.5. Ethical Approval and Informed Consent
The survey collected anonymous professional opinions from adult volunteers in Saudi Arabia and Egypt about non-sensitive architectural façades. In accordance with the applicable institutional guidelines at the time of data collection, this type of minimal-risk, anonymous opinion survey of adults was exempt from formal ethics committee review. Participation was voluntary, and information on the study purpose, procedures, and the right to withdraw at any time was provided on the first page of the questionnaire. The form explicitly informed participants that no personal details (e.g., name or email address) would be collected automatically unless they chose to provide them. No identifying personal data were required, and responses were stored anonymously on a password-protected institutional drive and used solely for research purposes.
3.6. Mosque Selection Criteria
The mosques selected for this study are situated along the main roads of Riyadh, chosen for their impactful visual presence to city observers. Riyadh, characterized by its sprawling horizontal layout, necessitates navigation primarily through its main roads. Consequently, the visual perception of the city is predominantly shaped by these thoroughfares, with mosques along these routes forming a significant part of the city’s visual landscape.
Moreover, larger mosques are prioritized over smaller ones due to their size, which surpasses that of average buildings. Additionally, larger mosques often feature two minarets, contributing to their visual balance and prominence. Riyadh’s mosques exhibit a diverse range of architectural styles, and the selected mosques encompass this variety to explore the visual complexity inherent in each style and its contextual suitability. The selected mosques should meet the following criteria:
Location: Only mosques located along Riyadh’s primary roads were chosen, as these structures significantly contribute to the city’s visual identity due to their high visibility.
Scale and Prominence: Larger mosques were prioritized because of their greater visual impact and architectural detailing, often including dual minarets which contribute to visual symmetry and equilibrium.
Architectural Diversity: The selected samples represent a diverse range of architectural styles prevalent in Riyadh, ensuring that the study captures variations in visual complexity and equilibrium across different designs, including modern and traditional styles. This diverse range of visual style represents the visual characteristics of Riyadh.
Cultural and Contextual Significance: Each mosque plays a representative role in reflecting Riyadh’s traditional and modern architectural trends.
This selection strategy intentionally privileges large, highly visible mosques on primary roads and thus constitutes a non-probability, purposive sample. As such, it is suitable for controlled comparison of FD, visual equilibrium, and preference ratings across a coherent typology, but it does not support statistical generalization to all mosque types in Riyadh. We therefore interpret the results as indicative patterns for this specific class of prominent mosques, to be tested in future research using larger, stratified samples that include neighborhood-scale and historic mosques across multiple districts.
4. Results and Discussion
We assessed the selected mosque façades using three complementary lenses: fractal dimension (FD; a proxy for multi-scale visual complexity), Larrosa’s visual equilibrium (net visual weight VW), and a visual-preference survey (behavioral external criterion). We first report each measure, then examine their associations (FD—VW; VW—preference; FD—preference). Survey- and VW-based effects are reported with point estimates and, where estimable, 95% confidence intervals. Façade-level associations among FD, VW, and survey means (n = 5) are summarized using Spearman’s ρ, interpreted descriptively as exploratory effect sizes; we do not report p-values or adjust for multiple testing at this level. FD values are treated as façade-level indices without associated confidence intervals.
4.1. Fractal Dimension Calculation for the Mosque Facades
Each mosque was modeled from architectural drawings, and a Level-3 front elevation (massing and primary edges, without textures) was prepared in AutoCAD for box-counting. For FD calculation, we applied the manual CAD-based procedure described in
Section 3.3. A square grid was superimposed at six box sizes (7200, 3600, 1800, 900, 450, and 225 mm), grouped into three two-scale segments corresponding to FD1 (coarse), FD2 (intermediate), and FD3 (fine) (
Table 2). For each segment, the number of occupied boxes was counted manually at the two box sizes, and FD was computed from the two log–log points using the equation given in
Section 3.3. Because each FD estimate is based on only two box sizes, the regression line through these points is exact (R² = 1.000) and 95% confidence intervals for the slope cannot be estimated. In the analyses that follow, we therefore treat FD as a deterministic façade-level summary and do not attach sampling-based uncertainty estimates to these values.
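The two-point estimate described above can be sketched in Python as follows. This is a minimal illustration, assuming the standard box-counting slope between two grid sizes and assuming that the six box sizes pair off consecutively into the three bands. The function name `two_point_fd` and all box counts are hypothetical and are not taken from Table 2.

```python
import math

# Box sizes (mm) from the text; the pairing into bands is an assumption.
BANDS = {
    "FD1": (7200, 3600),  # coarse
    "FD2": (1800, 900),   # intermediate
    "FD3": (450, 225),    # fine
}

def two_point_fd(count_coarse, count_fine, size_coarse, size_fine):
    """Slope of log(count) against log(1/size) through two points."""
    rise = math.log(count_fine) - math.log(count_coarse)
    run = math.log(1 / size_fine) - math.log(1 / size_coarse)
    return rise / run

# Hypothetical occupied-box counts for one facade (illustrative only).
example_counts = {"FD1": (12, 34), "FD2": (90, 240), "FD3": (600, 1700)}

for band, (size_coarse, size_fine) in BANDS.items():
    n_coarse, n_fine = example_counts[band]
    fd = two_point_fd(n_coarse, n_fine, size_coarse, size_fine)
    print(band, round(fd, 3))
```

Because a straight line through two points always fits exactly, a sketch of this kind cannot yield a residual or a confidence interval, which is why the FD values are treated as deterministic indices.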
4.2. Fractal Dimension Results Interpretation
Applying the box-counting approach with six box sizes yielded three fractal-dimension (FD) estimates per façade. While smaller box sizes are often expected to produce more accurate results, this is not always the case. The most reliable estimate is typically the one that repeats across scales, forming a consistent pattern. In this paper, however, three visual-assessment approaches are used for verification; therefore, we retain all three FD results for correlation with visual equilibrium and visual preference. Visual-preference data can help compare FD estimates by highlighting the most correlated scale. The finest scale band (FD3) exhibits the strongest correspondence with perceived visual complexity in this sample and is therefore treated as the most informative FD descriptor for the present analysis, while FD1 and FD2 are retained as supplementary multi-scale descriptors.
Across the sample, mosques scored FD values between 1.2 and 1.89, indicating a wide range of visual complexity within Riyadh’s architectural character. Riyadh’s architecture generally tends toward simplicity, light colors, and minimal ornament. Relative to its predecessor—the Najdi style in the old center of Riyadh—medium to low FD values are expected. Ostwald classifies FD as low (≤1.5), exemplified by Le Corbusier’s work, and high (≥1.5), as in Frank Lloyd Wright’s architecture, establishing 1.5 as a balance point and arguing that higher FDs are closer to natural patterns [
11]. Taylor identifies the 1.3–1.5 FD range as stress-reducing in architecture [
40]. In this context, and consistent with repeated results in each case, the mosques in this study generally lean toward lower FD values (below 1.5), aligning with the above expectations.
By case: Al Babtain mosque showed consistent results (1.37–1.44) across the three iterations, indicating a stable self-similarity pattern. Abdulaziz Alfaris mosque dropped from ~1.6 to 1.2 at the third iteration, suggesting fewer close-scale details; façade curves likely elevated the first-iteration value. Al-Mohainy mosque increased from 1.26 (first) to 1.48 and 1.54 (second and third), revealing more small-scale detail, likely due to richer openings and minaret features. Awaidhah mosque decreased from the first to the second iteration and stabilized at 1.52 in the third; its slender design elevated the first-iteration value, whereas small openings reduced the second. King Abdullah mosque began high at 1.89 and decreased to 1.47–1.48 in the second and third iterations. Across mosques, the third iteration appears most reliable: the first and second iterations matched in two cases, while the second and third matched in three, indicating greater stability by the third iteration. Validation against the survey results is required to confirm this inference.
For the third iteration (FD3), visual-complexity values range from 1.203 (minimum) to 1.547 (maximum), forming a consistent pattern of low fractal-dimension complexity, generally below the 1.5 midpoint, within this sample of large main-road mosques. This suggests that, among such visually prominent cases, mosque façades in Riyadh tend toward lower visual complexity overall.
4.3. Visual Equilibrium Calculations
Mosque façades were analyzed in AutoCAD based on their geometry and simplified into rectangular components, as illustrated in
Figure 5,
Figure 6,
Figure 7,
Figure 8 and
Figure 9. Following Larrosa’s method, openings were calculated with reversed sign [
2]. Repeated rectangular elements were summed to compile the calculation tables [
Table 3,
Table 4,
Table 5,
Table 6 and
Table 7].
The façade partitioning method is based on the illustration in
Figure 4. As established by Larrosa, ornaments and curved elements are reduced to simple rectangular boundary shapes, ignoring internal window patterns and treating windows as openings that reduce weight in opposition to solid elements and projections. Curved elements such as arches and circles are bounded by rectangles for calculation purposes, because Larrosa’s equation operates only on rectangular shapes.
The façade segmentation level used for VW adopts the same Level-3 segmentation as for FD (
Section 3.2), to match the level of façade representation and accuracy and to make the results more comparable. Larrosa’s original method did not specify a calibrated façade-segmentation level, so this alignment ensures consistency between the FD and VW analyses.
- 1.
Al Babtain mosque
Recesses and glazing are treated as voids relative to the solid stone elements projecting on the façade (see
Figure 5); see detailed calculations in
Table 3.
- 2.
Abdulaziz Alfaris mosque
The column capitals are treated as distinct elements forming a repeated pattern; each capital is calculated separately. The glazing behind is treated as openings. Proportions are visually dominated by the columns and the entablature (
Figure 6). Detailed calculations are provided in
Table 4.
- 3.
Abdullah Nasser Al-Mohainy mosque
The minarets subdivide the façade into smaller parts, affecting overall proportions. All doors and windows are treated as openings that interrupt the main volume (
Figure 7). Detailed calculations are provided in
Table 5.
- 4.
Awaidhah mosque
Details within openings are ignored—each opening is treated as a single unit to preserve the geometric unity of patterns. Visible volumes are the primary determinants of proportions. Because of the top continuous elements that ensure horizontal continuity, the main building is treated as one element (
Figure 8). Detailed calculations are provided in
Table 6.
- 5.
King Abdullah Bin Abdul Aziz mosque
The proportions resolve into three primary objects. The minaret is dominant and vertical, forming a single unit anchored to the ground. The main building is divided into two strongly horizontal elements, producing a substantially heavy visual weight. In this case, color contrast is the principal visual differentiator (
Figure 9). Detailed calculations are provided in
Table 7.
4.4. Interpretation of Visual-Equilibrium Results
The results range from +36 to −30, indicating substantial diversity in visual proportions across cases. For example, in
Table 5 the component with a = 9 m and b = 20.8 m yields
p = 0.272, |Fp| = 5.665 and |Fc| = 33.045; its visual-weight contribution (VW = −27.380) is obtained as the signed difference between the ascending and descending forces. The façade-level total VW = −30.481 is then calculated by summing all component VW entries in the table. A notable pattern emerges in the minaret proportions, which tend to approach visual equilibrium in all cases except King Abdullah mosque. From the first row of each table, the minaret visual weights are: −0.1 (Al Babtain), −1.3 (Alfaris), +0.9 (Al-Mohainy), and −0.8 (Awaidhah)—a consistent pattern of near-zero (i.e., visually balanced) values. King Abdullah mosque’s minaret, at −2.3, deviates modestly due to its greater height, indicating increased descending visual weight. Given the prominence of minarets in Riyadh’s architecture—and their proportions and forms that differ from other regions—it is noteworthy that, overall, they contribute positively to façade-level equilibrium.
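The arithmetic in the Table 5 example at the start of this paragraph can be checked with a short sketch. It assumes only what the text states: a component's visual weight is the ascending-force magnitude |Fp| minus the descending-force magnitude |Fc|, and the façade total is the sum of the component values. The function names are hypothetical, and the computation of |Fp| and |Fc| from a and b (Larrosa's equation) is not reproduced here.

```python
def component_vw(fp_abs, fc_abs):
    """Signed difference between ascending (|Fp|) and descending (|Fc|) forces."""
    return fp_abs - fc_abs

def facade_vw(component_values):
    """Facade-level visual weight: sum of all component VW entries."""
    return sum(component_values)

# Values quoted in the text for the Table 5 component (a = 9 m, b = 20.8 m).
vw_component = component_vw(5.665, 33.045)
print(round(vw_component, 3))

# The other Table 5 rows are not listed in the text; their combined
# contribution is implied by the reported facade total of -30.481.
remaining = -30.481 - vw_component
print(round(remaining, 3))
print(round(facade_vw([vw_component, remaining]), 3))
```

Read this way, the single quoted component accounts for most of the negative façade total, and the remaining rows together contribute only a small additional descending weight.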
Among the cases studied, Al Babtain mosque exhibits the strongest façade-level equilibrium and the most balanced minaret. It also represents a commonly repeated architectural style in Riyadh (two rectangular minarets with standard proportions), which in this sample achieves greater visual balance than designs with slimmer, taller, or single minarets. These findings suggest that visual equilibrium may be a characteristic feature of Riyadh’s traditional mosque architecture, a tendency that remains to be tested on a larger sample.
4.5. Comparison of Fractal Dimension and Visual Weight
While FD3 shows the closest descriptive alignment with perceived visual complexity in this sample, this is treated as a within-sample tendency rather than a formal scale-selection criterion. [
Table 8] presents a side-by-side comparison of each building’s fractal dimension (FD) and visual equilibrium scores.
Figure 10 and
Figure 11 plot these façade-level indices as point estimates: FD1–FD3 at the three scale bands and the single VW score per façade. As noted in
Section 4, these computational measures are derived from a single box-counting procedure and a single documented partitioning scheme and are therefore treated as deterministic façade-level indices without associated confidence intervals.
Visual-weight results in
Figure 11 are interpreted by the sign and magnitude of the value. A positive value indicates that the building appears visually lightweight, with an upward tendency. A negative value suggests greater visual weight, giving the impression that the building is grounded or “piercing the earth.” Values near zero signify that the building is visually balanced, i.e., in equilibrium. Sample 5 is perceived by respondents as the lightest, while Sample 3 is the heaviest; samples 1 and 3 are the most balanced in terms of VW, exhibiting values closest to visual equilibrium.
Looking at the fractal-dimension graph in
Figure 10, FD1 differs from the other two estimates in three of the five cases, whereas FD2 and FD3 are similar in four of the five cases, indicating that FD2 and FD3 are more stable than FD1 across this sample. In the visual-equilibrium graph (
Figure 11), a qualitative similarity is observed between FD1 and VW. The façade-level Spearman correlations between VW and the three FD bands are ρ ≈ 0.07 for FD3, ρ ≈ −0.30 for FD2, and ρ ≈ 0.70 for FD1. Any apparent advantage of one FD band over another in this five-façade sample is therefore interpreted as a descriptive tendency rather than a formal scale-selection rule. Given the small sample (
n = 5), these coefficients are interpreted descriptively as effect sizes rather than as the basis for formal inference, but they suggest that the coarse-scale FD1 band aligns most closely with the part sizes used in the visual-equilibrium calculations.
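A façade-level Spearman coefficient of this kind can be sketched without external libraries as the Pearson correlation of the rank vectors. The snippet is illustrative only: the function names are hypothetical and the five FD and VW values are invented placeholders, not the study's data.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical facade-level values for five facades (not the study's data).
fd_band = [1.60, 1.30, 1.45, 1.70, 1.85]
vw = [5.0, -30.0, -2.0, 1.0, 36.0]
print(round(spearman_rho(fd_band, vw), 2))
```

With n = 5 and no tied values, ρ reduces to 1 − Σd²/20, where d is the rank difference for each façade, so the coefficient can only move in coarse steps. This is a further reason to read such values descriptively.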
Future work can examine this relationship by computing visual-equilibrium results at multiple shape-analysis scales. For example, if window patterns are considered and glass frames are counted as separate panels, matching the finer 225 mm box-counting scale, results may show scale-specific correlations between visual complexity and visual equilibrium. Comparison with the visual-preference survey will help explore which fractal-complexity scale shows the clearest sample-specific correspondence with perceived complexity. More broadly, future studies should test the robustness of FD–preference and FD–VW patterns by experimenting with alternative box-size progressions, different FD band groupings, and FD estimates based on more than two box-counting points per band.
4.6. Visual Preference Survey Approach
If the mosque images in the survey were 2D elevation drawings identical to those used for computation, they would feel unrealistic and less relatable to respondents. Conversely, if we used photographs, they would not be directly comparable to the 2D elevations. Therefore, we adopted 2D rendered images as a compromise—striking a balance between realism and comparability. A similar image type was used in a visual-preference survey on Chinese courthouse architecture [
41].
The visual-preference survey serves as an external criterion to explore how two computational approaches align with expert perception in this specific set of façades, rather than to establish definitive validation. The survey targeted architects, visual-design specialists, architecture academics, design enthusiasts, and architecture students, all evaluating the same five mosque façades in Riyadh that were analyzed computationally. The survey items and 0–10 scale anchors were drafted based on previous visual-preference studies in environmental and architectural design and were iteratively reviewed and refined by the authors to ensure clear definitions of ‘visual complexity’ and ‘visual balance’ and a transparent interpretation of the midpoint (5) as perceived equilibrium. In this way, the survey acts as a perceptual anchor, providing an exploratory check on how FD- and VW-based metrics align with expert impressions within this sample.
To maximize interpretability and reliability, we used two types of questions:
Responses are counted and multiplied by the numerical scores above and later correlated with the scale ratings to assess response consistency. Presenting these definitions before the scale questions helps respondents anchor their subsequent ratings. At the same time, we acknowledge that presenting these definitions and guided categories immediately before the rating tasks may also prime respondents toward the constructs used in the computational metrics; as a result, any survey–metric correspondences should be interpreted as convergent rather than fully independent validation.
- 2.
0–10 scale ratings (primary measures)
Participants then rated each façade on two 0–10 scales:
Visual complexity: 0 = very simple, 10 = highly detailed/complex, 5 = balanced.
Visual equilibrium (visual weight): 0 = very light/upward, 10 = very heavy/grounded, 5 = equilibrium.
These scale ratings are the primary survey outputs used to compare against fractal dimension and visual equilibrium calculations.
The full survey instrument, including instructions, rendered façade images, multiple-choice questions, and 0–10 rating scales, is provided as
Supplementary Material S1.
4.6.1. Results Interpretation
To check internal consistency, we computed Pearson’s correlation coefficient between the two question types (multiple-choice and 0–10 rating scales) for each construct. The internal correlation between the two survey formats for equilibrium is very high (r ≈ 0.99, n = 5, p < 0.01), providing a descriptive indication of excellent consistency in how respondents applied the balance concept, rather than a formal hypothesis test. We then correlated the survey outputs with the computational results to address the study aims. For the façade-level associations among FD, VW, and façade-level mean ratings (n = 5), we report Spearman’s ρ as descriptive effect sizes without p-values or multiple-comparison corrections, given the very small sample size.
As a further accuracy check, we correlated the mean rating-scale scores with the numerically coded multiple-choice counts. A higher correlation indicates clearer respondent understanding (greater survey accuracy), whereas a lower correlation suggests possible confusion or construct misinterpretation.
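The consistency check between the two question formats can be sketched as below. Everything specific in the snippet is an assumption made for illustration: the option labels, their numeric codes, the response counts, the mean ratings, and the normalization by the number of responses are hypothetical placeholders rather than the study's coding scheme or data.

```python
def coded_score(option_counts, option_codes):
    """Sum of (responses per option x numeric code), divided by total responses."""
    total = sum(option_counts.values())
    weighted = sum(option_counts[name] * option_codes[name] for name in option_counts)
    return weighted / total

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical codes and counts; not the study's coding table or responses.
codes = {"light": 0, "balanced": 5, "heavy": 10}
counts_per_facade = [
    {"light": 4, "balanced": 20, "heavy": 6},
    {"light": 2, "balanced": 12, "heavy": 16},
    {"light": 1, "balanced": 8, "heavy": 21},
    {"light": 3, "balanced": 14, "heavy": 13},
    {"light": 12, "balanced": 15, "heavy": 3},
]
mean_scale_ratings = [5.2, 6.9, 7.8, 6.4, 3.9]  # hypothetical 0-10 means

coded = [coded_score(c, codes) for c in counts_per_facade]
print(round(pearson_r(coded, mean_scale_ratings), 3))
```

A coefficient near 1 in a check of this kind would indicate that the two formats order the façades in nearly the same way, which is how the text interprets clearer respondent understanding.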
Visual complexity. For comparability with fractal dimension (FD) values, rating-scale results were linearly mapped to the FD range [1, 2]. Specifically, the average scale score y (on the 0–10 scale) was mapped as
$$FD_{\text{survey}} = 1 + \frac{y}{10}$$
Thus, a scale score of 10 becomes 2.0 and a score of 0 becomes 1.0, aligning the survey outputs with the conventional FD range [
11]. Because this is a simple linear, monotone transformation of the 0–10 ratings, it preserves the rank ordering of façades and therefore does not change any Spearman correlations; it is used solely to place the survey-based curve on the same vertical axis as the FD values for visualization. The mapped survey values were then correlated with FD1, FD2, and FD3. Given the small
n and exploratory nature of the study, weak correlations are interpreted simply as weak associations, without attributing error specifically to either the survey or the FD measure.
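The linear mapping of complexity ratings onto the FD range, and the fact that it leaves the façade ordering unchanged, can be illustrated with a few lines of Python. The mapping follows the endpoints stated above (0 becomes 1.0, 10 becomes 2.0); the function names and the five mean ratings are hypothetical.

```python
def map_rating_to_fd_range(mean_rating):
    """Linearly map a 0-10 mean rating onto the FD range [1, 2]."""
    return 1.0 + mean_rating / 10.0

def rank_order(values):
    """Indices of the items sorted from lowest to highest value."""
    return sorted(range(len(values)), key=lambda i: values[i])

# Hypothetical mean complexity ratings for five facades (illustrative only).
mean_ratings = [4.2, 6.1, 5.0, 3.8, 4.9]
mapped = [map_rating_to_fd_range(r) for r in mean_ratings]

print([round(m, 2) for m in mapped])
print(rank_order(mean_ratings) == rank_order(mapped))
```

Because the ordering is identical before and after mapping, any rank-based coefficient computed against FD1, FD2, or FD3 is the same for raw and mapped ratings.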
Visual equilibrium: Larrosa’s logarithmic formulation treats the principal/ascending force as positive and the complementary/descending (weight) as negative; hence, negative sums indicate heaviness, and positive sums indicate lightness. In the survey, however, the 0–10 scale was anchored so that higher numbers = heavier appearance, in order to avoid confusion about the sign convention. To align the survey outputs with the computational metric and center the scale at equilibrium, we applied two adjustments: (i) recenter at 5 (equilibrium midpoint) and (ii) flip the sign so that heavier becomes more negative. Empirically, computed VW values range approximately from −50 to +50, so we also scaled the survey scores to this range. For a rating-scale score (y), the mapping is:
$$VW_{\text{survey}} = (y - 5) \times (-50)$$
This linear, monotone transformation yields a survey-based equilibrium metric directly comparable to the objective VW, without altering the rank ordering of façades or changing which samples are perceived as balanced, heavy, or light; in particular, it does not affect the Spearman correlations and serves solely to place the survey scores on a comparable axis for visual comparison with the VW calculations. As a simple sensitivity check, replacing the multiplier −50 with −1 produces the same Spearman correlation (ρ ≈ 0.85, n = 5) and identical façade ranking, confirming that this factor only rescales the vertical axis.
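The sensitivity check on the multiplier can be reproduced in outline as follows. The sketch uses the mapping stated in the text, (y − 5) multiplied by −50 or by −1; the function names and the five mean ratings are hypothetical and serve only to show that the ranking does not depend on the multiplier.

```python
def map_rating_to_vw(rating, multiplier=-50):
    """Recenter a 0-10 rating at 5 and flip the sign so heavier is negative."""
    return (rating - 5) * multiplier

def ranking(values):
    """Facade indices ordered from lowest to highest value."""
    return sorted(range(len(values)), key=lambda i: values[i])

# Hypothetical mean equilibrium ratings for five facades (illustrative only).
mean_ratings = [5.1, 5.6, 6.8, 6.0, 4.3]

scaled = [map_rating_to_vw(r) for r in mean_ratings]  # multiplier -50
unscaled = [map_rating_to_vw(r, multiplier=-1) for r in mean_ratings]

print(ranking(scaled) == ranking(unscaled))
```

Any negative multiplier gives the same ordering, so the choice affects only the plotted range. With the multiplier −50, the scale endpoints 0 and 10 map to +250 and −250, whereas a multiplier of −10 would place them at +50 and −50.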
4.6.2. Visual-Complexity Survey Results
Visual-complexity survey outcomes are summarized in
Table 10 and correlated with the computed fractal-dimension values in
Table 11. FD3 shows a weak-to-moderate positive association with perceived complexity (r ≈ 0.43,
n = 5), which we treat as an exploratory pattern within this small sample. FD1 and FD2 correlations with perceived complexity are close to zero, indicating little evidence of alignment at the coarser scale bands in this sample. The relationships between the mapped survey scores and FD1/FD2/FD3 are shown in
Figure 12. The highest association is observed for FD3, as highlighted in
Figure 13, indicating that the finest scale band (FD3) aligns most closely with participants’ perceived complexity.
When interpreting
Figure 13, FD3 exhibits the strongest correspondence with subjective impressions. Nonetheless, Samples 2 and 5 show notable divergences. In Sample 2, arches lower the calculated FD at coarse scales, yet respondents judged the façade as visually complex—likely due to repetition of arches and perceived construction intricacy. In Sample 5, the façade’s ornamentation yields a higher FD, but many respondents perceived the overall composition as simple, possibly because its rectilinear order and traditional layout cue simplicity despite local detail.
Key observations:
Visual complexity: perception vs. measurement: The discrepancies between survey responses and calculated fractal dimensions (FD) suggest that respondents may interpret “complexity” differently from how FD operationalizes it (edge density across scales). To reduce ambiguity, future surveys could provide a brief primer with one visual example (low-detail planar façade vs. fine-grained patterned façade) and a single comprehension check before the rating tasks. This would clarify that FD reflects multi-scale edge structure rather than ornament meaning or style.
Factors influencing survey responses: The mention of confusion or lack of visual capability among respondents underscores the multifaceted nature of factors influencing perceptions. Age, cultural background, and expertise in art or architecture are all potential variables shaping individuals’ evaluations of visual complexity. Additionally, environmental factors such as lighting conditions or presentation format could impact responses. For instance, viewing images on a computer screen versus in-person may lead to varying perceptions, highlighting the need for controlled experimental conditions in future studies.
Role of fractal dimension in architectural analysis: The weak-to-moderate positive association observed between FD3 and survey results suggests the potential utility of FD analysis as a tool for understanding human perception of complexity in architectural forms. This finding implies that certain architectural features or design principles may influence fractal dimension calculations and their relationship to human perception. Further research could explore these dynamics in greater detail to elucidate the underlying mechanisms driving perceptions of complexity.
Interpreting discrepancies in specific cases: The notable discrepancy observed in the perception of the Alfaris mosque highlights the intricate nature of architectural perception. Unique architectural features may confound traditional measures of complexity, necessitating nuanced interpretation. Qualitative methods such as interviews or focus groups could provide valuable insights into the specific aspects of design that influenced participants’ perceptions. Understanding these discrepancies is crucial for refining measurement techniques and enhancing the accuracy of architectural assessments.
Implications for design practice: Understanding how people perceive complexity in architecture is vital for designers seeking to create aesthetically pleasing architecture. The findings of this study have significant implications for design decisions aimed at balancing complexity and simplicity to meet users’ preferences and needs. Designers should consider not only objective measures of complexity but also subjective perceptions when creating architectural designs. Iterative testing and feedback could be valuable for refining designs based on user preferences and enhancing overall user experience.
4.6.3. Visual-Equilibrium Survey Results
Survey outcomes for visual equilibrium are summarized in
Table 12, and their associations with the computed visual weight (VW) are reported in
Table 13. The paired trends are plotted in
Figure 14, showing a close correspondence between subjective (mapped survey) and objective (computed VW) measures. The high descriptive correspondence (Spearman’s ρ ≈ 0.85,
n = 5) provides preliminary convergent evidence that Larrosa’s visual-weight estimates track expert judgments of balance for this sample of mosque façades. Given the convenience-based expert sampling and the conceptual definitions provided to participants, this evidence should be interpreted as initial rather than definitive validation, and further work is needed with more diverse participant groups and tasks that minimize priming.
The corresponding correlation using the numerically coded multiple-choice responses is r = −0.791; this negative sign is expected because the multiple-choice coding and the VW sign convention are inverted (heavier → larger positive code in the survey, whereas VW treats heaviness as more negative). Taken together, these results provide preliminary convergent evidence for the Larrosa visual-equilibrium calculation in this specific context.
The visual-preference data were obtained from a convenience sample of architecture academics, professional architects, and final-year architecture students, recruited through professional networks. This expert composition and the provision of conceptual definitions for ‘visual complexity’ and ‘visual equilibrium’ (
Table 9) may increase alignment between subjective ratings and the theoretical constructs underlying Larrosa’s model. As a result, the observed high correlations between computed visual weight and survey-based balance should be regarded as upper-bound, expert-sample estimates of convergent validity, not as definitive proof that Larrosa’s method captures balance perceptions in the general population. Future research should replicate the study with lay participants and alternative tasks (e.g., pairwise comparisons without explicit definitions) to test generalizability and to further probe the robustness of the model.
Interpreting
Figure 14. The overall pattern supports the calculation method, with one notable divergence at Sample 4. Respondents perceived Sample 4 as heavier than the computed VW suggests. This is plausibly due to the façade’s slim vertical openings, which can visually signal heaviness to observers; in the computation, however, those openings contribute opposite-signed (lightening) terms that partially offset the dominant horizontal elements, yielding a value nearer to equilibrium. Aside from this case, the trajectories of the two curves are consistent. Overall, the correspondence between subjective evaluations and the calculated results is consistent with both reliable expert perception of visual equilibrium and the usefulness of Larrosa’s visual-weight calculation for this sample.
Key observations:
Internal survey consistency: The strong within-survey agreement (rating scale vs. multiple-choice; see
Table 12 and
Table 13) indicates that respondents applied the visual-weight concept consistently across question formats. This suggests that the survey effectively captured participants’ perceptions of visual weight, providing valuable insights into their assessments of the prominence of architectural elements within surveyed structures.
Objective–subjective correspondence: The high descriptive correspondence between computed VW and mapped survey results (Spearman’s ρ ≈ 0.85, n = 5) provides preliminary convergent evidence that Larrosa’s method can act as a quantitative proxy for perceived balance in this sample of façades. As a simple sensitivity check, we recomputed this association using the mapping (y − 5) × (−1) instead of (y − 5) × (−50); the Spearman correlation and the ranking of façades were identical, indicating that the scaling factor only rescales the plotted values and does not influence the rank-based correspondence. In line with this, we treat the mapped survey scores as a visualization aid that places subjective and computed indices on comparable axes, rather than as an additional source of statistical evidence.
Case-level nuance (Al Babtain vs. Awaidhah): In Al Babtain, perceived equilibrium closely matches the computation (near zero), illustrating strong convergence. In Awaidhah, the perceived heaviness exceeds the computed value—likely reflecting how observers weigh continuous horizontal bands and narrow vertical perforations.
Perceptual emphasis differs from complexity: Several respondents appeared more certain about either complexity or weight for a given façade. This suggests that complexity (detail across scales) and equilibrium (up–down balance) tap distinct judgments. Designers and researchers should treat them as complementary, not interchangeable, constructs.
In conclusion, the strong agreement between computed visual weight and surveyed visual equilibrium, with a single interpretable outlier, suggests that computed VW may serve as a useful quantitative proxy for perceived balance in this specific context. Incorporating confidence intervals, multi-operator partitioning and inter-rater agreement tests, part-level sensitivity checks, and case notes (as above) will further strengthen the evidential base for the method’s validity.
5. Conclusions
This study triangulated fractal dimension (FD), visual equilibrium (Larrosa’s visual weight, VW), and a visual-preference survey to characterize the visual qualities of a small set of Riyadh mosque façades. Across the five cases, FD values for the third (finest) iteration (FD3) fell between 1.203 and 1.547, indicating generally low visual complexity within this sample. Most samples lie near the 1.3–1.5 band often associated in the literature with calming, stress-reducing patterns, which is consistent with the restrained, Najdi-influenced vocabulary of these large main-road mosques.
Within this limited set of façades, the high descriptive alignment between computed VW and survey-mapped equilibrium (Spearman’s ρ ≈ 0.85, n = 5) provides preliminary convergent-validity evidence that Larrosa’s method can act as a quantitative proxy for perceived balance in this context. Minarets were typically near-balanced elements; paired, separate minarets tended to help maintain whole-façade equilibrium, especially when counterweighted by the broader prayer-hall mass. In the Al Babtain example, equilibrium was close to zero with slight lightness, consistent with its canonical Najdi proportions as perceived by the expert sample.
For visual complexity, correlations with subjective judgments were weak to moderate overall, but highest for FD3 (≈0.40), suggesting that finer-scale detail (openings, frames) is most salient to observers. Divergences in specific cases (e.g., arches perceived as complex despite moderate FD; ornamented yet “simple” façades due to strong global order) indicate that repetition and overall organization modulate perceived complexity beyond what FD at a given scale captures.
A tentative rule of thumb, for façades of comparable style and composition, is that targeting FD ≈ 1.3–1.5 with VW ≈ 0 may yield visually calm and balanced compositions; moving toward FD ≈ 1.5–1.6 may increase perceived liveliness, provided that equilibrium is maintained.
These conclusions should be viewed in light of two main limitations: (i) the small, purposive set of five large main-road mosques in a single city, and (ii) the expert convenience sample (architects, academics, and advanced students), which likely provides upper-bound alignment between computational metrics and perception. As a result, patterns such as lower visual complexity and the near-equilibrium balance observed in some twin-minaret façades are best interpreted as within-sample tendencies and testable hypotheses, not universal properties of mosque architecture or of twin-minaret designs in Riyadh.
Future work should (i) where sample sizes allow, report confidence intervals with all effect sizes, (ii) explore scale-matched equilibrium (partitioning at finer part sizes to test FD1/FD2/FD3 against corresponding VW), (iii) add feature-specific FD (e.g., windows-only) and simple proxies for repetition and global order, and (iv) broaden to neighborhood mosques, historic fabric, and other Saudi and non-Saudi cities, including lay participants, to test cross-typology and cross-cultural generalizability. Because Larrosa-based equilibrium values in this study depend on a single documented partitioning protocol implemented by one primary coder, the VW findings should be regarded as preliminary and method-development oriented rather than as fully validated across alternative segmentation choices. A priority for future work is to repeat the Larrosa calculations with multiple independent partitioners, quantify inter-operator agreement, and test the sensitivity of VW outcomes to alternative, yet reasonable, façade segmentation strategies. In addition, subsequent studies should test whether the observed FD–preference and VW–survey correspondences remain stable under alternative (including non-linear) mapping schemes for the rating scales to confirm that these patterns are not artifacts of rescaling. Short, guided primers in surveys may also help reduce ambiguity around “visual complexity” and “visual balance.” This triangulated framework is designed to be adaptable to other building types and urban contexts. In all such applications, FD and VW are intended to serve as transparent, discussable indicators that support design and review conversations, while leaving final judgments about visual quality with human decision-makers.
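As one possible starting point for the multi-partitioner check proposed above, the sketch below summarizes how far VW values diverge when several coders segment the same façades independently. The coder labels, the VW values, and the two summary statistics are all illustrative assumptions; they do not come from this study, and a formal agreement coefficient (for example, an intraclass correlation) could be substituted.

```python
from itertools import combinations
from statistics import mean

# Hypothetical VW values (NOT the study's data): one list per coder,
# with positions corresponding to the same five facades in the same order.
vw_by_coder = {
    "coder_A": [0.02, -0.10, 0.15, 0.30, -0.25],
    "coder_B": [0.05, -0.08, 0.10, 0.35, -0.20],
    "coder_C": [-0.01, -0.12, 0.18, 0.28, -0.30],
}


def per_facade_range(table):
    """Spread (max minus min) of VW across coders, one value per facade."""
    return [max(col) - min(col) for col in zip(*table.values())]


def mean_abs_pairwise_diff(table):
    """Mean absolute VW difference over all coder pairs and all facades."""
    diffs = []
    for a, b in combinations(table.values(), 2):
        diffs.extend(abs(x - y) for x, y in zip(a, b))
    return mean(diffs)


spreads = per_facade_range(vw_by_coder)
overall = mean_abs_pairwise_diff(vw_by_coder)
print("per-facade spread:", [round(s, 2) for s in spreads])
print("mean absolute pairwise difference:", round(overall, 3))
```

Façades with a large spread would flag compositions whose VW is sensitive to segmentation choices and would be natural candidates for the part-level sensitivity checks described above.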