Once workpiece observations are made, the inference module identifies possible events and tracks possible future positions of workpieces based on those events. For our work, any of the following can be considered as an event:
The inference algorithm is modeled after the simple reasoning a human employs when monitoring inventory through a camera. If the workpiece is directly visible within the camera’s view, then we look no further since we know exactly where it is. However, if a workpiece is missing from its last seen location, then one would tend to go through the video feed and:
The intuition described above is also our motivation to build a graph-based model that identifies and tracks events in a probabilistic manner. Factors such as proximity and temporal data can be used to estimate the strength, or confidence, of an event between any two objects. These pairwise estimates are simple building blocks, but the combined effect of such events is difficult to track when multiple workpieces interact with each other several times. Graphs have been used successfully to model such systems. The building blocks establish relations, as edges, between any two workpieces, represented as nodes. These simple relations are then used to build more complex relations or dependencies between all the workpieces, so that effects can be tracked through paths, reachability and connectivity between nodes within the graph-based model.
4.1. Building Graph of Events between Workpieces
The event graph models the probability of an event occurring between any two workpieces, for all workpieces observed within the system. All observed workpieces are represented as nodes, and the probability of an event between any two nodes is represented as an edge in the undirected event graph (Figures 5–11). Since an event in this system is described as an interaction, its probability of occurrence depends on two factors: the proximity between the two workpieces, and the duration of time the workpieces were within and out of each other’s proximity. This can be formulated as Equation (1), such that the edge weight increases if the two workpieces are within each other’s neighborhood and at least one of them is visible, and decreases if the two workpieces are out of each other’s neighborhood and both are visible. Since the edge strength represents the probability of an event between the two workpieces, it is bounded to the interval [0, 1].
where,
| is the function parameter for h(.) |
| represents time |
| is the edge weight between the two nodes at time t in the event graph. |
The requirements above are such that we want the edge weight between two nodes to increase with time when the corresponding workpieces are observed close to each other, as they are more likely to get stacked. At the same time, we want the edge weight to decrease when we observe both workpieces far from each other, as there is a diminished chance of them getting stacked or interacting with each other. Apart from proximity, time spent in proximity is also factored into the edge weight, as there is a time cost involved when workpieces are getting stacked. Without accounting for time spent in proximity, workpieces that momentarily pass close to each other would tend to trigger events, which in turn increases the search space. While there are a variety of ways to encode these behaviors in h(.), our approach here is to model it as a cumulative distribution function over an exponential distribution, as described in Algorithm 1.
Algorithm 1: Building the event graph. |
|
where,
| Elapsed time since last observation for workpiece i |
| User defined time threshold to set the state of a workpiece as missing. This is set to two seconds in our experiments. |
| Rate parameter for an exponential distribution that decides the weight decay within a workpiece’s neighborhood (user-set parameter). Smaller values give wider neighborhoods with softer weights, which translates to a slower rate of increase in the linger counter within the neighborhood. Higher values give smaller neighborhoods with sharp weights, which translates to a sharp increase in the linger counter when workpieces are within the tight neighborhood. |
| Weight attributed to the proximity factor, modeled as an exponential decay so that larger distances between workpieces yield lower values. |
| Linger counter that increases or erodes the potential for an event over time. The longer a workpiece is observed within proximity (lingers), the higher the value. The range is clipped to keep the value bounded. |
| Rate parameter that converts accumulated linger values to a probabilistic estimate through an exponential cumulative distribution function. Smaller values translate to a slower rate of increase in event potential, whereas higher values produce a faster rate of increase. The values are determined by the user depending on the character of the workpiece movement. Note that the rate at which the event potential reaches 1.0 is controlled jointly by two parameters. |
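The behavior described for the event-graph edge weight can be illustrated with a minimal Python sketch. It is not the paper's Algorithm 1: the function names, the `radius` neighborhood cutoff, the erosion step `1 - w`, and the clip bound `linger_max` are all assumptions introduced only for illustration. It shows a proximity weight that decays exponentially with distance, a clipped linger counter, and an exponential CDF that maps the counter to an event probability.

```python
import math


def proximity_weight(distance, rate):
    """Proximity weight as an exponential decay over distance (assumed form)."""
    return math.exp(-rate * distance)


def update_linger(linger, distance, both_visible, any_visible,
                  radius, rate, linger_max=10.0):
    """Grow the linger counter inside the neighborhood, erode it outside.

    `radius` and `linger_max` are hypothetical parameters, not from the paper.
    """
    w = proximity_weight(distance, rate)
    inside = distance <= radius
    if inside and any_visible:
        linger += w            # lingering close by raises the event potential
    elif not inside and both_visible:
        linger -= 1.0 - w      # both seen far apart: erode the potential
    # Clip so the counter stays bounded.
    return min(max(linger, 0.0), linger_max)


def event_probability(linger, cdf_rate):
    """Exponential CDF turning the linger counter into a value in [0, 1)."""
    return 1.0 - math.exp(-cdf_rate * linger)
```

In this sketch, a pair that is only momentarily close accumulates a small counter and therefore a small event probability, which matches the stated goal of not triggering events for workpieces that merely pass by each other.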
After an event, a workpiece can be in any of the following states:
In all the above cases, we observe the following:
We lose line-of-sight for all workpieces that get stacked upon.
The workpiece on top of the stack is the only workpiece that can be directly observed and tracked.
A stack might get dispersed, shuffled or split into other stacks while not having a line-of-sight for all its members.
In all the above cases, there exists a dependency between the occluded workpiece and the other workpieces forming a stack. Since stacks can accumulate and be dispersed, shuffled or split without direct observation of the process, we preserve chained dependencies between the members involved. These dependencies are tracked by the dependency graph.
Once the events are identified and logged into the event graph, built as per Algorithm 1, dependencies between workpieces are extracted from it into the dependency graph. The nodes in this graph represent the workpieces, whereas the edges represent the strength or probability of dependence between the nodes. An edge in this case directly represents the system’s belief that the involved nodes are in a stack. Using this graph, for any given workpiece, we can get a list of other workpiece locations it might be stacked at. While the event graph builds events for all observed workpieces, in both the visible and the missing sets, the dependency graph is only updated for missing workpieces, as a recently observed or visible workpiece is expected to be independent of other workpieces. The dependency graph is built as described in Algorithm 2.
Since the edge weights in the graph are positive, the shortest path also takes the path of highest edge probability or strength. In our system, we use Dijkstra’s algorithm [24] to calculate this path. The penalty factor in Algorithm 2 encodes the confidence for workpieces to be stacked with other workpieces through indirect events (described in Section 4.2).
Algorithm 2: Building the dependency graph. |
|
where,
| Shortest path (based on edge weights) between nodes i and k in the graph. |
| Distance in terms of node separation between i and k. |
| Penalty factor for separation. |
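The quantities listed above (a shortest path, a node separation and a penalty factor) can be combined in a small Python sketch. This is an illustration under stated assumptions rather than the paper's Algorithm 2: the adjacency-dict layout, the function names, the use of `-log(p)` as the Dijkstra edge cost (chosen here so that the shortest path coincides with the most probable path), and the combination rule `path probability * penalty ** separation` are all hypothetical.

```python
import heapq
import math


def best_paths(event_graph, source):
    """Dijkstra over -log(p) costs.

    `event_graph` is an adjacency dict {node: {neighbor: probability}} with
    probabilities in (0, 1]. Returns {node: (path_probability, hops)}.
    """
    best = {source: (0.0, 0)}
    heap = [(0.0, 0, source)]
    done = set()
    while heap:
        cost, hops, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        for nbr, p in event_graph.get(node, {}).items():
            if p <= 0.0 or nbr in done:
                continue
            new_cost = cost - math.log(p)
            if nbr not in best or new_cost < best[nbr][0]:
                best[nbr] = (new_cost, hops + 1)
                heapq.heappush(heap, (new_cost, hops + 1, nbr))
    return {n: (math.exp(-c), h) for n, (c, h) in best.items() if n != source}


def dependency_weights(event_graph, missing, penalty):
    """Assumed rule: scale each path probability by penalty ** node separation."""
    return {n: prob * penalty ** hops
            for n, (prob, hops) in best_paths(event_graph, missing).items()}
```

With this assumed rule, a penalty of 1 leaves chained dependencies at their full path probability, while smaller penalties discount candidates that are several nodes away from the missing workpiece.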
4.2. Native Mechanisms in Our Graph Model
In this section, we demonstrate the native mechanisms in our graph structure that build and update events and dependencies between observed and missing workpieces. Our system was tested using observations made from a simulated virtual environment in Gazebo [25], as shown in Figure 4. This environment allows us to place and move the AR markers (placeholders for workpieces) and virtual cameras to simulate simple scenarios. Since the cameras were simulated with known intrinsic and extrinsic parameters, no additional calibration step was taken. The camera image shown in Figure 4b was re-rendered for clarity, as shown in Figures 5–10.
Populating and setting the state of workpieces as nodes: As soon as Workpieces 1–5, shown as AR tags, are identified within an observed image published by any of the cameras, their unique tags, locations and timestamps are logged into the system (Figure 5). Since all the workpieces are directly visible, no dependencies are observed by the system’s dependency graph (Figure 5c).
Identifying events and building event weights: The distance between workpieces and the time spent within each other’s proximity are constantly observed, and the edge strength for the possibility of an event is updated at every observation. As shown in Figure 5b, the edge weights between Workpieces 1 and 2 and between Workpieces 1 and 3 are increased as Workpieces 2 and 3 are within Workpiece 1’s neighborhood, increasing the potential for an event. In addition, note that the dependency graph, at this point, has no edges among Workpieces 1–3, as all workpieces are still directly visible, making them independent of each other. The edge weight between Workpieces 1 and 2 is eroded as per Algorithm 1 when Workpiece 2 is observed away from Workpiece 1 (Figure 6b). This demonstrates the diminished confidence in the potential for events when members are observed to be moving away from each other.
Dependency for missing or occluded workpieces: When a workpiece goes missing, due to a lack of direct observations for a period of time, the dependency graph looks up the event graph to identify potential events and builds relations between the relevant members. In our case, since Workpiece 3 gets stacked under Workpiece 1 (Figure 6a) and since we already identified the potential for an event between Workpieces 3 and 1, a possible dependency between them is established in the dependency graph. At the same time, we occlude Workpiece 5 with a random object (Figure 6a). Since no potential events involving Workpiece 5 were observed in the event graph, its state is set to missing or occluded with no dependencies (Figure 6b). However, if other workpieces, such as Workpiece 4, are observed within the neighborhood of its last known position (Figure 7), we still treat Workpiece 5 as being present in that position, but possibly occluded, and increase the potential for an event with Workpiece 4 (Figure 7b). This in turn creates a dependency between Workpieces 4 and 5 (Figure 7c).
Direct and indirect dependencies: Stacks can further be stacked, shuffled and dispersed during inventory pulls. The dependency graph tracks all such chained events to appropriately update dependencies. In Figure 7a, Workpiece 1 is moved and stacked with Workpiece 2, creating a direct dependency between them in the dependency graph. However, since Workpiece 3 was already believed to be stacked with Workpiece 1, an indirect dependency between Workpieces 3 and 2 is also built in the dependency graph (Figure 7c). Stacking Workpiece 2 with Workpiece 4 creates similar chained indirect dependencies in the dependency graph (Figure 8). Note that the higher the node separation between workpieces, the less confident the system is in that location, due to the separation penalty.
Breaking dependencies when a missing workpiece is observed: Once the previously missing Workpiece 3, suspected to be in a stack with Workpieces 1, 5, and 2, is observed (Figure 9), all possibilities for how the stack might have split need to be covered. In our case, let us say that the occluded members of the stack were shuffled somewhere along the way. When Workpiece 3 is observed, the stack is assumed to be split with no direct knowledge of the order of the split, so the dependency graph preserves all possibilities of the split, wherein the missing Workpieces 1, 2, and 5 could be stacked under either Workpiece 3 or Workpiece 4. The same reasoning is repeated when Workpieces 1 and 2 later become directly observable as well (Figure 10) after getting split from their respective stacks.
Note that, in Figure 10c, the system is not sure of the location of Workpiece 5 and believes that it could be under any one of Workpieces 1–4. While it may seem exhaustive to search under all other workpieces for Workpiece 5, the system does preserve the last known position in memory, making it the first location to search based on the generated totem pole (described in Figure 12). If Workpiece 5 is not found at its last known position, as in Figure 10, our system proposes the other workpieces that interacted with it as search locations, based on the events that occurred. Once Workpiece 5 is observed again, all workpieces within this scene become independent of each other (Figure 11).
Getting a totem pole of search locations for a missing workpiece: Once the dependency graph is available, a totem pole of possible stack locations can be generated for any workpiece in it. The resulting list is based on the normalized weights on the edges of that workpiece in the dependency graph, as shown for Workpiece 1 in Figures 12 and 13 for the frame in Figure 9. The value of the separation penalty has a direct effect on the totem pole of search locations, as it penalizes indirect events based on separation levels. With a separation penalty applied, the totem pole of locations we get for Workpiece 1 (Figure 12) is quite different from the one obtained with no separation penalty for the same state of events (Figure 13). The value of the penalty is determined by the user as per the environment. If it is commonplace for stacks to get further stacked and dispersed through the workspace, a higher value would be preferred. However, if the chance of stacks getting further stacked is quite low, then a lower value would be more appropriate. A lower value would reduce the significance of locations resulting from chained stacking in the totem pole. At the extreme, the lowest setting would completely ignore all chances of stacking and would force the system to suggest the last seen position as the only search location.
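The ranking step can be sketched in a few lines of Python. This is a hedged illustration, not the authors' implementation: the function name `totem_pole`, the `"last_seen"` label, and the choice to give the last known position a fixed weight of 1.0 so that it ranks first are assumptions made here. The input is a hypothetical mapping from candidate workpieces to dependency-edge weights for the missing workpiece.

```python
def totem_pole(dep_weights, last_seen_weight=1.0):
    """Rank candidate search locations by normalized weight.

    `dep_weights` maps candidate workpiece ids to dependency-edge weights.
    The last known position is added as its own candidate (assumed weight).
    """
    candidates = {"last_seen": last_seen_weight}
    # Candidates whose weight was penalized down to zero are dropped.
    candidates.update({n: w for n, w in dep_weights.items() if w > 0.0})
    total = sum(candidates.values())
    # sorted() is stable, so "last_seen" stays first on ties.
    ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, w / total) for name, w in ranked]
```

If every dependency weight has been penalized to zero, the list in this sketch reduces to the last seen position alone, mirroring the limiting case described above.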