Bone Morphogenetic Proteins, Carriers, and Animal Models in the Development of Novel Bone Regenerative Therapies

Bone morphogenetic proteins (BMPs) possess a unique ability to induce new bone formation. Numerous preclinical studies have been conducted to develop novel, BMP-based osteoinductive devices for the management of segmental bone defects and posterolateral spinal fusion (PLF). In these studies, BMPs were combined with a broad range of carriers (natural and synthetic polymers, inorganic materials, and their combinations) and tested in various models in mice, rats, rabbits, dogs, sheep, and non-human primates. In this review, we summarize bone regeneration strategies and animal models used for the initial, intermediate, and advanced evaluation of promising therapeutic solutions for new bone formation and repair. We also discuss basic aspects to be considered when planning animal experiments, including anatomical characteristics of the species used, appropriate BMP dosing, duration of the observation period, and sample size.


Introduction
Bone tissue possesses unique regenerative properties, and bone fractures regularly heal under physiological conditions. However, large segmental bone defects resulting from severe trauma or extensive tumor resection cannot be restored by endogenous self-repair mechanisms, decrease quality of life, and may sometimes lead to limb amputation. Indeed, the management of large segmental defects is one of the most challenging issues in orthopedic medicine, typically due to the biologically hampered microenvironment [1,2]. The standard of care for the healing of large bone defects requires the use of an autologous bone graft (ABG), which is usually harvested from the iliac crest. ABG is also used as a gold standard to achieve spinal fusions, including posterolateral spinal fusion (PLF). PLF is a commonly performed surgical procedure used for the treatment of degenerative diseases of the spine, including degenerative disc disease, spondylolisthesis, spinal instability, and symptomatic scoliosis [3][4][5][6].
However, ABG possesses several disadvantages, including a limited amount of bone that might be harvested, the potential transfer of contaminating agents, acute and chronic pain, skin scarring, and deformity at the donor site [4,7]. In addition, the use of ABG increases the blood loss, duration, and cost of surgical procedures. Therefore, there remains an imminent need for the development of novel bone regeneration strategies to enrich or replace ABG. Among these, osteoinductive devices are under investigation for clinical use in PLF and healing of large segmental long bone defects.
Inorganic materials as potential BMP carriers include calcium phosphate (CaP) ceramics, calcium phosphate and calcium sulfate cements, and bioglass [2,5,29,32,38,42,. The most commonly used inorganic materials in preclinical studies are CaP ceramics, further subdivided into hydroxyapatite (HA), tricalcium phosphate (TCP), and biphasic calcium phosphate (BCP) containing both HA and TCP at various ratios. We have recently shown that the chemical composition of ceramics does not affect the amount of newly formed bone induced by the osteoinductive device [73,74]. However, HA and TCP differ significantly in resorbability (HA is very stable, while TCP is more resorbable), which eventually results in different residual ceramic volumes. The resorbability might be adjusted by varying HA/TCP ratios in BCP ceramics [75]. Moreover, CaP ceramics might be formulated into particles or blocks in a broad range of sizes and geometrical shapes, while porosity, pore size, and interconnectivity are adjusted during the sintering process [73,75,76]. We demonstrated that particle size affects the volume of newly formed bone; smaller particles (74-420 µm) combined with rhBMP6 resulted in higher bone volume than larger particles (1000-4000 µm) [73]. Another important determinant of ceramics is the pore size, since pores from 300 to 400 µm promoted the formation of the largest bone volume [51].
The fourth group of BMP carriers are composites of the aforementioned materials, which have been introduced to overcome the limitations encountered with a single component. The most typical combinations are composites containing either natural or synthetic polymers with CaP ceramics [39,[77][78][79][80][81][82][83][84][85]. In these combinations, ceramics improve the biomechanical properties of the implants and are used to address compressibility issues. Less frequently, natural and synthetic polymers might be combined.
We have recently developed an autologous bone graft substitute (ABGS) comprised of BMP6 delivered within an autologous blood coagulum to which a compression-resistant matrix (CRM), such as allograft or synthetic ceramics, can be added [22,73,74,76,[86][87][88][89][90][91][92]. Moreover, the volume of newly induced bone increased with the amount of CRM, which might be attributed to the increase in overall surface area [73].

Animal Models
Animal models are routinely used in the development of novel bone regenerative therapies [8]. Models might be categorized according to the species (mouse, rat, rabbit, sheep, non-human primate) and tested indication (ectopic model, critical-size defect, PLF). In this review, we suggest a classification based on the stage of preclinical development, namely initial, intermediate, and advanced testing of osteoinductive devices (Figure 1). Initial testing includes rodent ectopic and rodent critical-size defect models for rapid comparison of different osteoinductive responses. Intermediate evaluation includes adequate rabbit models (segmental defect and PLF model), while advanced testing uses canine, sheep, and non-human primate models as a final step before clinical trials.

Ectopic Models
Rodent ectopic models have been extensively used for the initial evaluation of novel osteoinductive therapies. They might also be used for investigating the biology of ectopic bone induction and the formation of a bone organ or ossicle, including bone and bone marrow [31,32,39,[48][49][50][51][52][53][54][55][56][57]71,73,76,86,87,[93][94][95][96][97][98][99][100][101][102][103][104][105]. Rodent ectopic models (Tables 1 and 2) are further subdivided according to the species (mouse, rat) and the implantation site (subcutaneous or intramuscular). Implantation under the skin (Figure 2A-D) or into the muscle does not affect the bone formation outcome, and the bone formation occurs in the first two weeks following implantation of an osteoinductive device [76,86,87]. Later time points are needed for the evaluation of bone longevity and maintenance of the ectopic bone structure. Molecular and cellular events during the cascade of bone formation can be evaluated using microCT/nanoCT and histological analyses. Immunohistochemistry, flow cytometry, gene profile microarrays, and single-cell RNA sequencing are among other analytical techniques used for unraveling the mechanism of ectopic bone formation.

Bone Defect Models
Mouse or rat bone defects are the initial orthotopic models to evaluate the osteoinductive properties of novel therapies and the osseointegration of newly formed bone with adjacent native bone. There are two main bone defect models in rodents: a calvarial critical-size defect and a segmental femoral defect. In the calvarial critical-size defect, circular bone defects are created in the mouse (3-5 mm) [109][110][111][112][113][114][115][116][117][118] and rat (4-8 mm) [29,55,[58][59][60]66,77] calvaria (Figure 3A; Tables 1 and 2), while segmental defects of the long bone are typically created in the femur, both in mice (2-3 mm) [119][120][121] and rats (6-10 mm) [5,[33][34][35]67,78,79] (Figure 3B; Tables 1 and 2) and filled with tested osteoinductive material. The development of a reproducible non-union model in the mouse is demanding, and, in contrast to rat non-union models, mouse non-union models are sparse [122]. The main shortcoming of this model is a relatively small defect size compared to clinically relevant proportions. Moreover, it is difficult to obtain full stabilization of the fracture, which results in increased callus formation.
Methods of evaluation include analyses of radiological images (CT/microCT), histological and histomorphometric analyses, and biomechanical testing, which might be conducted at the end of the observation period (Figure 4).
Figure legend (fragment): Tables 1 and 2. E, ectopic model; C, calvarial defect; F, femoral defect.

Duration of the Observation Period
The bone induction process is significantly faster in small animals than in larger animals and humans. Therefore, observation periods were significantly shorter in rodent ectopic and bone defect models than in studies on sheep and non-human primates (Figures 4 and 5, 2nd row) (Figure 4, 1st and 2nd columns, 2nd row). However, depending on the purpose of the study, observation periods

Posterolateral Spinal Fusion (PLF) Model
The rabbit is the most commonly used species for the evaluation of the efficacy and safety of promising therapeutic solutions for achieving PLF [5,6,37,38,40,[62][63][64]74,85,86,[127][128][129]. The transverse processes of lumbar vertebrae are exposed, and an osteoinductive device is implanted bilaterally between adjacent transverse processes (L4-L5 or L5-L6) [127]. Transverse processes should be decorticated before the implantation to promote osseointegration of newly formed bone with native bone [86]. In the majority of previous studies, the BMP dose was up to 1000 µg and was delivered on various carriers (Table 4). The spinal fusion outcome was evaluated 6 weeks following surgery, and the majority of rabbit PLF studies had an observation period of less than 10 weeks. Few studies had a prolonged observation period (>10 weeks), but later time points might be important to determine the survival and long-term maintenance of newly induced bone [6,86], which is clinically relevant in patients undergoing PLF surgery (Figure 5, 1st column).

Dog and Sheep Segmental Defect Model
Dog and sheep segmental defect models are used for advanced evaluation of bone regeneration therapies. In dogs, the defect (20-25 mm) is created in the radius or ulna (Figure 3C) [130][131][132][133][134][135]. Applied doses of BMPs ranged from 100 to 650 µg, which is higher than in the rabbit model. Moreover, the typical observation period (12-24 weeks) was also longer than in the rabbit model (Figure 5, 2nd and 3rd columns).
Tibial segmental bone defects in sheep (Figure 3D) were recently developed to evaluate novel bone regeneration therapies in conditions mimicking the size and biology of segmental bone defects in the clinic [2,[136][137][138][139][140]. There are two subtypes of this model: a fresh defect (FD) and a biologically exhausted defect (BED), the latter mimicking a patient with a non-union [2]. Following the creation of a large defect (30 or 45 mm) in the sheep tibia in the FD model, a polymethyl-methacrylate spacer is inserted to induce the formation of the Masquelet membrane. Six weeks following the creation of the defect, the spacer is removed and an osteoinductive device is inserted (FD model). In the BED model, the defect is in the first instance left untreated, leading to a non-union. Subsequently, debridement of the non-union or fibrotic tissue ingrowth (BED model) is performed, followed by implantation of a spacer for 6 weeks, and then, finally, after removal of the spacer, an implant is inserted. BMP doses applied in this model ranged from 344 to 3800 µg, while the typical observation period was up to 16 weeks [2]. Although the osteoinductive device containing BMP6 on a carrier achieved bridging in FD (30 mm), it was found that larger, biologically exhausted defects appear to require a cell-based implant together with BMP to achieve proper clinically relevant bridging (Table 5) [2]. Importantly, defects were mechanically well stabilized with a circular external fixator according to the Ilizarov technique.

Sheep PLF Model
The sheep PLF model is highly translatable to clinics because the size of the lumbar vertebrae of the sheep is comparable to that of humans. However, only a few preclinical studies have been conducted on this model [3,65,88] (Table 6). Sheep PLF may be conducted at a single level or as a multisegmental procedure. Moreover, it may be performed with or without instrumentation [88]. The observation periods were typically longer and the applied BMP doses higher in this model than in studies on small animals: the follow-up period was up to 6 months with a BMP amount up to 10 mg (Figure 5, 3rd column). Methods of evaluation included X-ray monitoring, microCT evaluation, histological analyses (Figure 2H), and biomechanical testing [3,65,88].

Non-human Primate (NHP) PLF Model
Non-human primates are the animal species most similar to humans, both anatomically and genetically. However, only a few studies were conducted using NHP PLF [14,63,141] (Table 6), primarily due to ethical and economic reasons. In these studies, the goal was to achieve a single-level fusion between adjacent lumbar transverse processes, which are anatomically similar to those of humans. The applied BMP2 doses (3-12 mg), as well as the observation period (24 weeks), were comparable to the sheep PLF model (Figure 5, 4th column).

PLF
Posterolateral spinal fusion (PLF) in preclinical studies is conducted in the lumbar portion of the spine. Although the basic anatomical features of lumbar vertebrae are similar among species discussed in this review, they differ in size and proportions of the different parts of the vertebrae. Rabbits (Figure 3E) and sheep (Figure 3F) have long transverse processes compared to the size of the vertebral body, while humans (Figure 3G), as an adaptation to erect posture and bipedal locomotion, have large bodies and short transverse processes. Importantly, transverse processes in rabbits are slanted and oriented anteriorly (Figure 2E or Figure 3E). On the other hand, the transverse processes in sheep (Figure 2G or Figure 3F) and humans (Figure 3G) are horizontal. The distances between the transverse processes are relatively short in rabbits (20-30 mm), while they are comparable in sheep and humans (40-50 mm).

Sample Size
Defining an appropriate sample size is a prerequisite for obtaining valid conclusions from each study. The appropriate sample size is affected by several parameters, including the experimental design and purpose of the study as well as expected differences among experimental groups. The sample size in the majority of studies reviewed here was 5-10 per group regardless of the animal species or model (Figures 4 and 5, 3rd row). Moreover, there is a consensus in published work that the minimal number of animals per experimental group is four. However, a few animals might die during surgery or follow-up periods due to reasons unrelated to the tested osteoinductive therapy; therefore, at least five animals per experimental group should be included.
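To illustrate how the expected difference among groups and the expected loss of animals feed into group size, the following minimal sketch uses the standard normal-approximation formula for a two-sided comparison of two group means. The standardized effect size, significance level, power, and loss fraction are illustrative assumptions rather than values taken from the reviewed studies, and the function names are hypothetical.

```python
import math
from statistics import NormalDist


def animals_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate animals per group for a two-sided, two-group comparison.

    effect_size is the expected difference between group means divided by
    the common standard deviation (an assumed, study-specific input).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # desired power
    return math.ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


def adjust_for_attrition(n_required, expected_loss=0.2):
    """Inflate group size so n_required animals are expected to complete the study."""
    return math.ceil(n_required / (1 - expected_loss))


# Illustrative call: a large assumed effect (2 standard deviations)
# with an assumed 20% loss of animals unrelated to the therapy.
n = animals_per_group(effect_size=2.0)
print(n, adjust_for_attrition(n, expected_loss=0.2))
```

Under these assumed inputs the calculation gives four animals per group before and five after the attrition allowance. Smaller expected differences increase the required number quickly, because the group size scales with the inverse square of the effect size.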

Study Outcomes
In Tables 1-6, it was not possible to describe the study outcomes due to non-comparable scoring grades for healing or spinal fusion experiments. The prerequisite in reporting the outcome of bone defect and spinal fusion studies is a clearly described success rate as the percentage of successfully rebridged defects or fused spine segments, respectively. Moreover, the method (radiological images, mobility testing) used to determine rebridgment/fusion should be clearly described. Surprisingly, in a large number of published studies, the success rate was not explicitly described. Several authors used their own scoring grades instead of standardized binary outcomes (successful or unsuccessful rebridgment/fusion). However, even when the binary outcome was used, the determination of successful rebridgment/fusion differed among authors. For example, a few authors determined success rate only on X-ray images without microCT, histology, and biomechanical testing. We suggest that successful rebridgment/fusion should also be determined with microCT, histological sectioning, and biomechanical testing.
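A binary outcome of this kind can be reported as a simple percentage. The sketch below is one possible way to compute it, assuming that each defect or spinal segment is scored as successful only when every evaluation method recorded for it agrees; this all-methods rule, the method names, and the function name are assumptions for illustration, not a scoring system taken from the reviewed studies.

```python
def success_rate(outcomes):
    """Percentage of defects or segments scored as rebridged/fused.

    outcomes: list of dicts, one per defect or spinal segment, mapping an
    evaluation method name to a binary result (True = bridged/fused).
    A unit counts as a success only if all recorded methods are True
    (an assumed rule; a unit with no recorded method is not a success).
    """
    if not outcomes:
        raise ValueError("no outcomes recorded")
    successes = sum(1 for unit in outcomes if unit and all(unit.values()))
    return 100.0 * successes / len(outcomes)


# Hypothetical data for five fused-segment evaluations.
example = [
    {"xray": True, "microCT": True, "histology": True, "biomechanics": True},
    {"xray": True, "microCT": True, "histology": True, "biomechanics": True},
    {"xray": True, "microCT": False, "histology": False, "biomechanics": False},
    {"xray": True, "microCT": True, "histology": True, "biomechanics": True},
    {"xray": False, "microCT": False, "histology": False, "biomechanics": False},
]
print(success_rate(example))
```

In this hypothetical data set, one segment appears fused on X-ray alone but not by the other methods, which shows how a rate based only on X-ray images could differ from a rate that also requires microCT, histology, and biomechanical testing.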
The experimental outcome of osteoinductive therapies using rodent ectopic models should be determined by microCT and histology. MicroCT analyses provide information on newly formed bone volume expressed as bone volume (BV) or bone volume/tissue volume ratio (BV/TV). Additionally, if the tested osteoinductive device contains ceramics, microCT analyses might be used to determine the amount of residual ceramic matrix. Moreover, microCT analyses allow the determination of structural properties of newly formed bone by calculating trabecular parameters (trabecular number, trabecular thickness, trabecular separation). The newly induced bone should also be analyzed by histology and histomorphometry to determine the volume of the bone and remaining carrier/matrix.
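As a minimal sketch of how BV, BV/TV, and residual ceramic volume follow from a segmented microCT region of interest, the example below counts labeled voxels. The label coding (0 = soft tissue or void, 1 = bone, 2 = residual ceramic), the voxel volume, and the function name are hypothetical; in practice, segmentation and the choice of the region of interest depend on the scanner software and the study protocol.

```python
def microct_summary(labels, voxel_volume_mm3):
    """Summarize a segmented microCT region of interest.

    labels: flat iterable of per-voxel integer labels using an assumed
    coding: 0 = soft tissue/void, 1 = bone, 2 = residual ceramic.
    voxel_volume_mm3: volume of a single voxel in cubic millimeters.
    """
    labels = list(labels)
    total = len(labels)
    if total == 0:
        raise ValueError("empty region of interest")
    bone = labels.count(1)
    ceramic = labels.count(2)
    return {
        "BV_mm3": bone * voxel_volume_mm3,        # bone volume
        "TV_mm3": total * voxel_volume_mm3,       # tissue (total) volume
        "BV/TV": bone / total,                    # bone volume fraction
        "residual_ceramic_mm3": ceramic * voxel_volume_mm3,
    }


# Hypothetical region of 100 voxels: 30 bone, 10 ceramic, 60 other.
roi = [1] * 30 + [2] * 10 + [0] * 60
print(microct_summary(roi, voxel_volume_mm3=0.5))
```

Trabecular parameters (number, thickness, separation) require three-dimensional morphometric analysis and are not covered by this voxel count.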

Conclusions
Due to the large socioeconomic burden of degenerative diseases of the spine and segmental defects of long bones, there is an imminent need for the development of novel osteoinductive therapeutic solutions [1,22]. However, until now, none of the osteoinductive devices have been approved for use in PLF and large segmental defects in patients. A broad range of bone regeneration strategies have been proposed and tested in different animal models. The vast majority of these studies have been conducted in rats and rabbits, covering only the initial and intermediate steps of preclinical testing, and despite claiming positive results, only a few have been further tested in sheep and NHP models. Infuse™, a BMP2-containing osteoinductive device, has been approved for use in anterior lumbar interbody fusion (ALIF) and acute tibial fractures but has also been used in various off-label indications. However, numerous side effects related to high BMP dose and a large release from the bovine collagen as a carrier have been reported. Therefore, there is a need for an osteoinductive device that would be efficacious at lower doses of BMP delivered on a carrier with a prolonged BMP release. There is some hope that novel engineered BMPs or innovative delivery systems for BMPs may reduce the required therapeutic doses. A novel ABGS containing rhBMP6 within autologous blood coagulum was evaluated in preclinical studies, and in exploratory clinical trials (high tibial osteotomy, distal radial fracture, and posterolateral interbody fusion), it was proven safe and efficacious at relatively low BMP6 doses [73,74,76,[86][87][88]91,92].