Let’s use Datashader to understand some of the gameplay mechanics of a hit video game while also making some abstract art.
Datashader
Visualisation
PUBG
Published
May 31, 2020
While browsing Kaggle, I came across this interesting dataset, and I thought it would form the basis for some exciting blog posts.
The dataset contains 65M player deaths, from 720,000 different matches, from PlayerUnknown’s Battlegrounds (PUBG), a wildly popular online game.
An Introduction to PUBG
Wikipedia sums up the aim of the game pretty well:
“In the game, up to one hundred players parachute onto an island and scavenge for weapons and equipment to kill others while avoiding getting killed themselves. The available safe area of the game’s map decreases in size over time, directing surviving players into tighter areas to force encounters. The last player or team standing wins the round.”
For something a bit less dry but just as accurate, there is this video on YouTube.
Data preprocessing
First, let’s load some of the libraries we will need later.
import glob
import pandas as pd
import datashader as ds
import datashader.transfer_functions as tf
import numpy as np
import matplotlib.pyplot as plt

plt.rcParams['figure.figsize'] = [15, 15]
The dataset comes in several different .csv files, which we will load and concatenate.
def load_deaths():
    li = []
    for filename in glob.glob("/Users/cooke_c/Documents/Blog_Staging/PUBG/9372_13466_bundle_archive/deaths/*.csv"):
        df = pd.read_csv(filename)
        # Drop the columns we don't need, to keep memory usage down.
        df = df.drop(['match_id','victim_placement','killed_by','killer_name','killer_placement','killer_position_x','killer_position_y','victim_name'], axis='columns')
        li.append(df)
    # Stack the per-file frames into one DataFrame with a fresh index.
    df = pd.concat(li, axis=0, ignore_index=True)
    return df
deaths_df = load_deaths()
Matches in PUBG are limited in time to approximately 32.5 minutes. Let’s create a new categorical variable called “phase”. It will represent which of three match phases (early, mid, or late) a player died in.
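The exact phase boundaries are not shown here, so the following is only a minimal sketch of how such a variable could be built with pandas. It assumes the deaths data has a `time` column measured in seconds since the start of the match, and it splits the match into three equal thirds. Both the column name and the cut-offs are assumptions for illustration, and `add_phase` is a hypothetical helper.

```python
import pandas as pd

# Assumed cut-offs: three equal thirds of a ~32.5 minute (1950 s) match.
# These are illustrative values, not necessarily the ones used for the plots.
MATCH_LENGTH_S = 32.5 * 60
PHASE_EDGES = [0, MATCH_LENGTH_S / 3, 2 * MATCH_LENGTH_S / 3, float("inf")]
PHASE_LABELS = ["early", "mid", "late"]


def add_phase(df, time_col="time"):
    """Return a copy of df with a categorical 'phase' column.

    time_col is assumed to hold seconds elapsed since the match started.
    """
    out = df.copy()
    out["phase"] = pd.cut(
        out[time_col],
        bins=PHASE_EDGES,
        labels=PHASE_LABELS,
        include_lowest=True,  # keep deaths recorded at exactly t = 0
    )
    return out


# Tiny synthetic example (not real match data).
demo = add_phase(pd.DataFrame({"time": [0, 120, 900, 1500, 2100]}))
print(demo)
```

Using `pd.cut` gives the column a categorical dtype, which is convenient for any later per-category aggregation.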
We want to aggregate data from deaths_df, using the ‘victim_position_x’ variable as the x coordinate and ‘victim_position_y’ as the y coordinate. Effectively, we are computing a separate 2D histogram for each category (game phase).
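To make the "separate 2D histogram per category" idea concrete, here is a small sketch in plain NumPy. It is not the Datashader code behind the images in this post, only an illustration of the same idea on a handful of synthetic points. Datashader's categorical count reducer (`ds.count_cat`) does roughly this, far more efficiently, on the full dataset. The function name `count_by_category`, the grid size and the coordinate extent are all made up for the example.

```python
import numpy as np


def count_by_category(x, y, category, bins, extent):
    """Return {category: 2D count grid}, one histogram per category.

    extent is [[x_min, x_max], [y_min, y_max]]; bins is the grid resolution.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    category = np.asarray(category)
    grids = {}
    for cat in np.unique(category):
        mask = category == cat
        counts, _, _ = np.histogram2d(x[mask], y[mask], bins=bins, range=extent)
        # histogram2d indexes as [x_bin, y_bin]; transpose so that rows
        # correspond to y and columns to x, like an image.
        grids[str(cat)] = counts.T
    return grids


# Synthetic death locations (illustrative values only).
x = np.array([10, 20, 700, 710, 790])
y = np.array([10, 30, 100, 700, 790])
phase = np.array(["early", "early", "mid", "late", "late"])

grids = count_by_category(x, y, phase, bins=2, extent=[[0, 800], [0, 800]])
for name, grid in grids.items():
    print(name)
    print(grid)
```

Each grid can then be mapped to its own colour and the layers blended, which is how a single image can show all three phases at once.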
Let’s take a closer look at the lower part of the Erangel map.
We can see the three phases of the game: the early phase in green, the mid phase in cyan, and the late phase in purple.
I will confess to having played a total of 2 games of PUBG before deciding that playing virtual hide and seek wasn’t that fun. Even so, some clear patterns stand out.
In the early phases of the game, deaths are in and around buildings as players search for supplies and weapons.
In the middle phase, the deaths appear to be more spread over the map, with concentrations on roads and natural chokepoints like bridges.
In the last phase of the game, the decreasing size of the “safe zone” forces the players into a concentrated area for a final stand. Because that final area differs from match to match, this results in the constellation of purple dots spread across the map.