Authors: Jonas Christophersen, Alessandro Pasta & Alex Incerti
1st July 2020
Read time: 17 minutes

“The one thing that unites all human beings, regardless of age, gender, religion, economic status, or ethnic background, is that, deep down inside, we all believe that we are above-average drivers.” - Dave Barry

Driving has a considerable impact on our lives and the landscape of our cities. Our behavior when driving is indeed a fascinating topic. In this report, we will investigate:

Before we start, here is a video summarizing our project idea:

Driving behavior in the Danish municipalities

On average, people in Denmark drive at 50 km/h, and 25% of cars exceed the speed limit. However, these numbers vary widely from municipality to municipality.

The map below displays the driving behavior in the Danish municipalities. Using the selector bar, you can choose which driving indicator to investigate. The color intensity denotes the value for each municipality, and hovering over a municipality shows the exact numbers.

Observation 1: People drive slower in urban areas. Municipalities such as Copenhagen, Odense, and Aalborg exhibit a lower average speed, probably due to lower speed limits and more traffic.

Observation 2: Municipalities vary widely in the percentage of people driving above the speed limit. Ringkøbing-Skjern has a high percentage of speeders (44%), whereas Copenhagen (10%) and Frederiksberg (5%) have the fewest.
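For readers curious how indicators like these can be derived, here is a minimal sketch. It assumes each observation is a (municipality, measured speed, speed limit) record. The record layout, the function name `driving_indicators`, and the sample numbers are illustrative assumptions, not the project's actual data or pipeline.

```python
from collections import defaultdict


def driving_indicators(observations):
    """Return {municipality: (average speed, percent above the limit)}.

    `observations` is an iterable of (municipality, speed_kmh, limit_kmh)
    tuples. This layout is hypothetical and chosen only for illustration.
    """
    grouped = defaultdict(list)
    for municipality, speed, limit in observations:
        grouped[municipality].append((speed, limit))

    result = {}
    for municipality, rows in grouped.items():
        avg_speed = sum(speed for speed, _ in rows) / len(rows)
        n_over = sum(1 for speed, limit in rows if speed > limit)
        result[municipality] = (avg_speed, 100.0 * n_over / len(rows))
    return result


# Made-up observations, only to show the shape of the computation.
sample = [
    ("A", 48.0, 50.0),
    ("A", 55.0, 50.0),
    ("B", 82.0, 80.0),
    ("B", 90.0, 80.0),
]
indicators = driving_indicators(sample)
```

Each municipality ends up with one average speed and one share of speeders, which is the kind of value the map colors encode.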

Socioeconomic factors

Why do people exhibit such a variation in their driving behavior across municipalities? Do socioeconomic factors also vary across municipalities?

The map below displays some socioeconomic factors in the Danish municipalities.

Observation 1: People tend to commute from the city to nearby areas. Municipalities close to the main cities (Copenhagen, Århus, Odense) have the highest percentage of ingoing commuters.

Observation 2: Adults with higher education tend to live in the largest cities. Municipalities like Copenhagen and Århus have a higher percentage of adults with higher education. Interestingly, near Copenhagen, highly educated people tend to live in the northern municipalities rather than the western ones.

The relation between driving behavior and socioeconomic factors

Having seen that both driving behavior and socioeconomic factors differ across municipalities, we investigated whether there is a correlation between the two. For instance, it would be interesting to investigate:

The visualization below enables the exploration of relationships between the percentage of people driving over the speed limit and various socioeconomic factors. Each dot corresponds to a municipality and the size of the dot is proportional to the number of inhabitants.

Observation 1: Municipalities with many elderly residents (65+ years) have more speed limit violations. There seems to be a positive correlation between the percentage of elderly and the percentage of speed violations. Do the elderly tend to drive over the speed limit more often? Maybe, but exploring the points in the scatter plot reveals another interesting aspect. The municipalities with a high percentage of elderly are mostly rural areas (the countryside of Jutland, Lolland, Bornholm), whereas cities such as Copenhagen, Aarhus, and Odense, along with the municipalities in the Copenhagen area, show both a lower percentage of elderly and a lower percentage of speed violations. The correlation could thus be due to the type of municipality (and hence the type of roads, amount of traffic, and amount of police checks), which is in turn reflected in the percentage of elderly.

Observation 2: Commuting is negatively associated with speed limit violations. Municipalities with high percentages of ingoing commuters, such as Frederiksberg, Rødovre, and Herlev, tend to have a lower percentage of people driving above the speed limit, and vice versa.
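The strength of relationships like the two above can be summarized with a Pearson correlation coefficient, which ranges from -1 (perfectly negative) to +1 (perfectly positive). The sketch below is a plain, unweighted version. The function name `pearson` and the sample values are assumptions for illustration and do not come from the project's dataset.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of the two variables around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Made-up values: share of ingoing commuters vs. share of speeders.
commuters = [20.0, 35.0, 50.0, 65.0]
speeders = [40.0, 30.0, 28.0, 12.0]
r = pearson(commuters, speeders)  # negative: more commuters, fewer speeders
```

A coefficient like this only measures association. As the elderly example shows, a third factor such as the type of municipality can drive both variables at once.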

Predicting speed limit violations based on socioeconomic factors

How is a change in the socioeconomic factors associated with a change in speed limit violations? To answer this, we fitted a linear regression model to the socioeconomic factors. We let the model decide which factors were the most significant for the prediction and got the following coefficients:

| Variable | Coefficient |
| --- | --- |
| Intercept | 119 |
| Percentage of adults (18/66 years) | -0.957 |
| Percentage of elderly (65+ years) | -0.0448 |
| Percentage of ingoing commuters | -0.134 |
| Percentage of adults with higher education | -0.300 |
| Percentage of unemployed | -3.10 |

Coefficients for Linear Regression Model
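To make the coefficients concrete, here is a small sketch of how a prediction is formed from them: the intercept plus each coefficient times the corresponding percentage. The coefficient values are the ones listed above. The dictionary keys, the function name `predict_speeders`, and the baseline municipality are hypothetical and chosen only for illustration.

```python
# Coefficients as reported for the linear regression model.
INTERCEPT = 119.0
COEFFICIENTS = {
    "adults": -0.957,
    "elderly": -0.0448,
    "ingoing_commuters": -0.134,
    "higher_education": -0.300,
    "unemployed": -3.10,
}


def predict_speeders(features):
    """Predicted percentage of cars above the speed limit."""
    return INTERCEPT + sum(
        COEFFICIENTS[name] * features[name] for name in COEFFICIENTS
    )


# A hypothetical municipality (all values in percent, not real data).
baseline = {
    "adults": 62.0,
    "elderly": 20.0,
    "ingoing_commuters": 40.0,
    "higher_education": 30.0,
    "unemployed": 3.0,
}

# Raise the share of ingoing commuters by 20 percentage points.
more_commuters = dict(baseline, ingoing_commuters=60.0)
change = predict_speeders(more_commuters) - predict_speeders(baseline)
# change is 20 * -0.134, i.e. about -2.68 percentage points
```

Because the model is linear, the effect of changing one factor does not depend on the values of the others: it is simply the coefficient times the size of the change.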

An interactive prediction tool is provided below. The slider values are restricted to the range represented in the data, to avoid extrapolation.

Observation 1: A 20 percentage-point increase in the share of ingoing commuters is associated with a predicted decrease of about 2.7 percentage points in the share of speeders. A possible explanation is that commuters usually drive during peak hours, causing traffic congestion. When stuck in traffic, speed limit violations are less likely to occur.

Observation 2: A 20 percentage-point increase in the share of adults with higher education is associated with a predicted decrease of 6 percentage points in the share of speeders. It is hard to say whether this is because highly educated drivers tend to respect the speed limits more or because they tend to live in urban areas (as seen in previous visualizations), which are less prone to speed limit violations.

It should be noted that the R² value is 0.48, indicating that the model is not great but still explains around half of the variance in the data.
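For reference, R² compares the model's squared errors with those of a baseline that always predicts the mean. The sketch below shows the standard formula. The function name `r_squared` and the example numbers are assumptions for illustration, not the project's data.

```python
def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    # Squared error of the model's predictions.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Squared error of always predicting the mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


# Made-up percentages of speeders: observed vs. predicted by some model.
observed = [10.0, 20.0, 30.0, 40.0]
estimated = [14.0, 18.0, 28.0, 44.0]
score = r_squared(observed, estimated)
```

A value of 1 means perfect predictions and 0 means the model does no better than the mean, so 0.48 sits roughly halfway between the two.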

Predicting the missing municipalities

In the traffic dataset that we have been working with, not all municipalities had enough observations for us to include them in our analysis. Some municipalities are therefore missing from the driving behavior map shown previously. To alleviate this, we use the linear regression model to estimate the percentage of speed limit violations in those municipalities from their socioeconomic factors. Press the Start Prediction button to apply the model.

As seen, all Danish municipalities now have their value (real or predicted).
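The gap-filling step can be sketched as follows: keep the measured value where one exists, and fall back to a model prediction otherwise. The function name `fill_missing`, the data layout, and the toy one-factor model are hypothetical stand-ins. In particular, `toy_model` is not the regression model described above.

```python
def fill_missing(measured, features, predict):
    """Return {municipality: (value, source)} for every municipality.

    measured: {municipality: observed percentage of speeders}
    features: {municipality: socioeconomic factors for that municipality}
    predict:  callable mapping one municipality's factors to a prediction
    """
    completed = {}
    for name, row in features.items():
        if name in measured:
            completed[name] = (measured[name], "real")
        else:
            completed[name] = (predict(row), "predicted")
    return completed


def toy_model(row):
    """Hypothetical one-factor model, used only to make the example run."""
    return 40.0 - 0.5 * row["ingoing_commuters"]


measured = {"A": 25.0}  # municipality "B" lacks enough observations
features = {
    "A": {"ingoing_commuters": 50.0},
    "B": {"ingoing_commuters": 20.0},
}
completed = fill_missing(measured, features, toy_model)
```

Tagging each value as real or predicted keeps the two kinds of numbers distinguishable when they are shown on the same map.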

Percentage over speed limit predictions for Danish neighborhoods

As a final bonus, we would like to show you one more thing. Since driving behavior might vary within each municipality, it would be interesting to predict the percentage of speed limit violations at the neighborhood level. We therefore used a K-Nearest-Neighbors algorithm to predict driving behavior across Danish neighborhoods. The visualization is shown below.

It is interesting to see how the color intensities fall around major cities. People really behave in the city! Try finding your own neighborhood and see what it's like.
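For the curious, the idea behind a K-Nearest-Neighbors prediction is simple: to estimate a value for a new point, average the known values of the k closest points. The sketch below assumes each neighborhood is described by a numeric feature vector (here a made-up pair of coordinates). The features, the value of k, and the function name `knn_predict` are assumptions, since the setup used for the map is described in the notebook rather than here.

```python
from math import dist


def knn_predict(train, query, k=3):
    """Average the targets of the k training points closest to `query`.

    train: list of (feature_vector, target) pairs
    query: feature vector of the point to predict
    """
    # Sort by Euclidean distance to the query and keep the k nearest.
    nearest = sorted(train, key=lambda item: dist(item[0], query))[:k]
    return sum(target for _, target in nearest) / len(nearest)


# Made-up neighborhoods: (x, y) position and percentage of speeders.
train = [
    ((0.0, 0.0), 10.0),
    ((1.0, 0.0), 20.0),
    ((0.0, 1.0), 30.0),
    ((10.0, 10.0), 100.0),
]
estimate = knn_predict(train, (0.1, 0.1), k=3)  # averages the three nearby points
```

With a small k the prediction follows local variation closely, while a larger k gives smoother but less local estimates.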

What a ride! Thank you so much for checking our project out.

You can find the behind-the-scenes Jupyter explainer notebook here.