# Mining Road Traffic Rules with Signal Temporal Logic and Grammar-Based Genetic Programming

## Abstract

## 1. Introduction

## 2. Related Work

## 3. Background: Signal Temporal Logic

## 4. Problem Statement

## 5. Methodology

#### 5.1. Evolutionary Algorithm

- It builds the offspring population ${P}^{\prime}$, with $|{P}^{\prime}|={n}_{\mathrm{pop}}$, by iteratively selecting one parent (mutation, with probability $1-{p}_{\mathrm{xover}}$) or two parents (crossover, with probability ${p}_{\mathrm{xover}}$) using tournament selection of size ${n}_{\mathrm{tour}}$, and then applying the genetic operator. If the resulting solution ${\phi}_{c}$ is already part of the offspring population ${P}^{\prime}$ or the parent population $P$, a new solution is generated, and the process is repeated for at most ${n}_{\mathrm{atts}}$ attempts; otherwise ${\phi}_{c}$ is added to ${P}^{\prime}$ and its fitness $f\left({\phi}_{c}\right)$ is computed.
- It merges the parent population $P$ and the offspring population ${P}^{\prime}$.
- It shrinks the resulting population $P$ by iteratively removing the worst solution until its size is ${n}_{\mathrm{pop}}$.
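
The three steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names (`tournament`, `next_generation`), the toy integer solutions, and the toy operators are hypothetical. Two points the text leaves open are filled in by assumption and flagged in the comments: fitness is treated as a quantity to minimize, and when all ${n}_{\mathrm{atts}}$ attempts produce duplicates the last candidate is kept so that the loop terminates.

```python
import random


def tournament(population, fitness, n_tour, rng):
    """Draw n_tour individuals at random and return the fittest one."""
    contestants = [rng.choice(population) for _ in range(n_tour)]
    # Assumption: lower fitness is better (minimization).
    return min(contestants, key=fitness.__getitem__)


def next_generation(population, fitness, evaluate, mutate, crossover,
                    n_pop, n_tour, p_xover, n_atts, rng):
    """One iteration: build offspring P', merge with P, shrink to n_pop."""
    seen = set(population)          # solutions currently in P or P'
    offspring = []
    while len(offspring) < n_pop:
        for _ in range(n_atts):     # requires n_atts >= 1
            if rng.random() < p_xover:
                p1 = tournament(population, fitness, n_tour, rng)
                p2 = tournament(population, fitness, n_tour, rng)
                child = crossover(p1, p2, rng)
            else:
                p1 = tournament(population, fitness, n_tour, rng)
                child = mutate(p1, rng)
            if child not in seen:   # not a duplicate: accept it
                break
        # Assumption: if every attempt yielded a duplicate, the last
        # candidate is kept anyway so that the loop always terminates.
        seen.add(child)
        if child not in fitness:
            fitness[child] = evaluate(child)
        offspring.append(child)

    merged = population + offspring
    while len(merged) > n_pop:      # iteratively remove the worst solution
        merged.remove(max(merged, key=fitness.__getitem__))
    return merged


# Toy problem (hypothetical): solutions are integers, the target is 42.
rng = random.Random(0)
evaluate = lambda x: abs(x - 42)
mutate = lambda x, rng: x + rng.randint(-3, 3)
crossover = lambda a, b, rng: (a + b) // 2

population = [rng.randint(0, 1000) for _ in range(20)]
fitness = {x: evaluate(x) for x in population}
best_before = min(fitness[x] for x in population)

for _ in range(30):
    population = next_generation(population, fitness, evaluate, mutate,
                                 crossover, n_pop=20, n_tour=3,
                                 p_xover=0.8, n_atts=10, rng=rng)

best_after = min(fitness[x] for x in population)
```

Because parents and offspring are merged before truncation, the best solution found so far is never lost, so the best fitness cannot get worse from one iteration to the next.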

#### 5.2. Fitness Function

#### 5.3. Grammar for STL Formula Structures

## 6. Experimental Evaluation

- Can we mine specifications that describe the input unlabeled trajectories?
- Are the mined specifications readable and interpretable for a human?

#### 6.1. Data

#### 6.2. Data Processing

#### 6.3. Results

#### 6.3.1. RQ1: Solutions That Are Effective

#### 6.3.2. RQ2: Specifications That Are Readable and Interpretable for a Human

- to balance the distances from the neighbors, and
- to drive neither too fast nor too slow.

## 7. Conclusions and Future Work

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Conflicts of Interest

## References

**Figure 1.** A derivation tree of the grammar of Figure 2 for the formula $({a}_{1}<r){\mathrm{S}}_{[{t}_{1},{t}_{2}]}\neg ({a}_{2}>r)$.

**Figure 2.** The CFG for describing STL formula structures. Non-terminal symbols are enclosed in angle brackets: the topmost non-terminal symbol, $\langle \mathrm{formula}\rangle $, is the starting symbol ${s}_{0}$ of the grammar. The derivation rules for the symbols $\langle {\mathrm{formula}}_{i}\rangle $, $\langle {\mathrm{logic}}_{i}\rangle $, and $\langle {\mathrm{temp}}_{i}\rangle $ are parametric on $i$, which represents the nesting level. The derivation rule for $\langle \mathrm{attr}\rangle $ is the one that tailors the grammar to a given system with attributes $A=\{{a}_{1},{a}_{2},\cdots ,{a}_{\left|A\right|}\}$.

**Figure 3.** Sample frame reproducing the traffic of the dataset [29]. Each colored box represents a car. Dotted lines are lane separators, while solid lines are guardrails. The two segments projecting out from the first level of the road are the boundaries of the on-ramp. The second level of the road is the continuation of the top one, while the red shaded rectangle is the range for the trajectory endpoints.

**Figure 4.** A car and its eight neighboring regions. Regions are labeled using cardinal directions. Boundaries can be computed directly from the $x,y$ position of the front-left corner of the car, using the car width and car height (provided in the dataset).

**Figure 5.** Distribution of the robustness $\rho (\phi ,\mathit{x},t)$, computed for all the I-80 trajectories $\mathit{x}$, for the best individual $\phi $ found in each run.

**Figure 7.** Number of occurrences of operators and attributes for the best individual of each evolutionary run.

**Table 1.** Fitness $f$ and solution (derivation tree) size $\left|\phi \right|$ of the best individual found in each run, and evolution time in seconds, reported as median ± standard deviation.

$f$ | $\lvert \phi \rvert$ | Time [s] |
---|---|---|
0.063 ± 0.009 | 52.5 ± 6.5 | 3876.6 ± 426.4 |

