This is about tracking objects in a video from frame to frame. Counting the objects in each frame is not enough: we also need to know what happened to each one between frames. Either it simply moved, or a topological event occurred: an object appeared, disappeared, merged with another, or split. A video is an example of an image sequence.
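The distinction between plain motion and topological events can be illustrated with a minimal sketch. It assumes each object in a frame is already given as a set of pixel coordinates, and that an object and its successor overlap in at least one pixel between consecutive frames. That overlap rule is only one possible matching criterion, chosen here for illustration. The function name `classify_events` and the labels are hypothetical.

```python
from collections import defaultdict


def classify_events(prev_objects, next_objects):
    """Classify what happened to objects between two consecutive frames.

    Both arguments map an object label to its set of (row, col) pixels.
    Assumption: two objects correspond if they share at least one pixel.
    Returns a list of (event, previous_labels, next_labels) tuples.
    """
    successors = defaultdict(set)    # previous label -> overlapping next labels
    predecessors = defaultdict(set)  # next label -> overlapping previous labels
    for p, p_pixels in prev_objects.items():
        for n, n_pixels in next_objects.items():
            if p_pixels & n_pixels:
                successors[p].add(n)
                predecessors[n].add(p)

    events = []
    # Previous objects with no successor vanished; with several, they split.
    for p in prev_objects:
        if not successors[p]:
            events.append(("disappear", (p,), ()))
        elif len(successors[p]) > 1:
            events.append(("split", (p,), tuple(sorted(successors[p]))))
    # Next objects with no predecessor are new; with several, they merged.
    for n in next_objects:
        if not predecessors[n]:
            events.append(("appear", (), (n,)))
        elif len(predecessors[n]) > 1:
            events.append(("merge", tuple(sorted(predecessors[n])), (n,)))
    # A one-to-one correspondence is plain motion (or no change at all).
    for p in prev_objects:
        if len(successors[p]) == 1:
            (n,) = successors[p]
            if len(predecessors[n]) == 1:
                events.append(("move", (p,), (n,)))
    return events


# Frame 1 -> frame 2: A shifts right, B vanishes, c is new.
frame1 = {"A": {(0, 0), (0, 1)}, "B": {(3, 3)}}
frame2 = {"a": {(0, 1), (0, 2)}, "c": {(5, 5)}}
events_12 = classify_events(frame1, frame2)

# Frame 3 -> frame 4: X breaks into two pieces.
frame3 = {"X": {(0, 0), (0, 1), (0, 2)}}
frame4 = {"x1": {(0, 0)}, "x2": {(0, 2)}}
events_34 = classify_events(frame3, frame4)
```

Both frame pairs contain two objects before and two objects or fewer changes in count than events, so the counts alone do not tell the story: in the first pair the count stays at two while one object disappears and another appears. Swapping the arguments in the second pair reports a merge instead of a split. The overlap criterion fails when objects move farther than their own size between frames, in which case a distance-based or motion-model match would be needed.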
See Iceberg is born for an example.
Another article: Motion tracking.