Computer vision is a powerful way to learn more about the world around us. Deep Learning is powering a renaissance in the field, unlocking new capabilities in SLAM (Simultaneous Localisation and Mapping). I want to spend some time exploring this area and, as part of that, work with a source of visually stimulating data.
One exciting data source is video games, and I’ve settled on the fast, frantic mega-franchise Call of Duty: Black Ops Cold War. Nuketown84, one of its maps, has a gritty, decaying ambience that makes it perfect for learning how to apply computer vision in practice. You can see some of the action here.
If our goal is to recognise where we are in the world based on what we can see, then our starting point is ground truth data: a record of where we actually were. Capturing this ground truth is an interesting challenge in its own right, and one I will build up over the next few posts.
One strategy is to infer the player’s position from the mini-map and the on-screen compass. By combining these measurements over time, we should be able to estimate where the player is in the world.
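To make that idea concrete, here is a minimal sketch of what "combining measurements over time" could look like. Everything in it is an assumption for illustration: the `PoseEstimate` and `fuse` names are made up, the mini-map position is treated as a pixel coordinate, and a simple exponential moving average stands in for a proper filter.

```python
from dataclasses import dataclass


@dataclass
class PoseEstimate:
    """A single timestamped guess at the player's pose (hypothetical structure)."""
    t: float        # capture time in seconds
    x: float        # position read off the mini-map (e.g. in pixels)
    y: float
    heading: float  # compass heading in degrees, 0 = north


def blend_heading(prev: float, new: float, alpha: float) -> float:
    """Blend headings along the shortest angular path, so 350° and 10° meet near 0°."""
    delta = ((new - prev + 180.0) % 360.0) - 180.0
    return (prev + alpha * delta) % 360.0


def fuse(previous: PoseEstimate, observed: PoseEstimate, alpha: float = 0.3) -> PoseEstimate:
    """Fold a new per-frame observation into the running estimate.

    Individual mini-map and compass readings are noisy, so rather than trusting
    any single frame we smooth with an exponential moving average.
    """
    return PoseEstimate(
        t=observed.t,
        x=(1.0 - alpha) * previous.x + alpha * observed.x,
        y=(1.0 - alpha) * previous.y + alpha * observed.y,
        heading=blend_heading(previous.heading, observed.heading, alpha),
    )


# Example: fold a stream of frame-by-frame observations into one running track.
estimate = PoseEstimate(t=0.0, x=120.0, y=80.0, heading=90.0)
for obs in [PoseEstimate(0.1, 122.0, 79.0, 95.0), PoseEstimate(0.2, 125.0, 77.0, 101.0)]:
    estimate = fuse(estimate, obs)
```

This is only a placeholder for the real thing; the later posts will dig into how the mini-map and compass readings are actually extracted and tracked.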