Deutschland Tour 2024

Exploring the route of this year’s ‘Deutschland Tour’ with R

Julian During
2024-08-07

Idea

The “Deutschland Tour” is a big road cycling race in Germany. It is nowhere near as big as the “Tour de France”, but maybe it will get there :).

This year’s “Deutschland Tour” is a special one: it passes near my hometown, so I am wondering where a good spot for spectators would be. For me these key points are important:

To answer this question, I will use my R skills to import and visualise the data.

Reproducibility

In this analysis the following libraries are used:

If you want to reproduce this analysis, you have to perform the following steps:

Alternatively, you can run this analysis by copying and executing it chunk by chunk in your R session (after installing the above-mentioned packages manually).
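If you go the manual route, the set of packages below should be enough to run the chunks in this post. Note that this list is an assumption, inferred only from the functions called in the code that follows; it is not necessarily the exact list used for the original analysis.

```r
# Assumed package set, inferred from the function calls in this post.
library(tidyverse)  # tibble(), map_df(), mutate(), parse_number(), ggplot(), as_factor()
library(httr2)      # request(), req_perform(), resp_body_string()
library(rvest)      # read_html(), html_elements(), html_attr(), html_text()
library(sf)         # st_as_sf(), st_crs(), st_combine(), st_cast()
library(ggspatial)  # annotation_map_tile(), layer_spatial()
library(leaflet)    # leaflet(), addPolylines(), addTiles(), colorFactor()
```

‘rvest’ is loaded separately because it is not among the packages that library(tidyverse) attaches.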

Data

First, define the URLs from which the GPX files can be read:

gpx_url <- c("https://www.deutschland-tour.com/fileadmin/content/2_Deutschland_Tour/DT_24/Elite/DT24_E1_SW-HN_177km_inklneutral.gpx", 
     "https://www.deutschland-tour.com/fileadmin/content/2_Deutschland_Tour/DT_24/Elite/DT24_E2_HN-GD_173km_inklneutral.gpx", 
     "https://www.deutschland-tour.com/fileadmin/content/2_Deutschland_Tour/DT_24/Elite/DT24_E3_GD-VS_211km_inklneutral.gpx", 
     "https://www.deutschland-tour.com/fileadmin/content/2_Deutschland_Tour/DT_24/Elite/DT24_E4_Annw-SB_182km_inklneutral.gpx")

Define a helper function that reads in the data of one stage. It downloads the GPX file with ‘httr2’ (Wickham 2024), parses the response with read_html(), and searches for the elements representing track points using a CSS selector.

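# Download one GPX file and return its track points as a tibble.
stage <- function(gpx_url, css_track_point) {
  resp <- req_perform(request(gpx_url))
  
  # Parse the response body and select the track point elements
  gpx_trackpoints <- resp_body_string(resp) |>
    read_html() |>
    html_elements(css_track_point)
  
  # Coordinates are attributes; the elevation is part of the element text
  tibble(
    lat = html_attr(gpx_trackpoints, "lat"),
    lon = html_attr(gpx_trackpoints, "lon"),
    elevation = html_text(gpx_trackpoints))
}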
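To see what the selector and the attribute lookups do without downloading anything, here is a minimal sketch that runs the same parsing steps on a tiny hand-written GPX fragment. The fragment and its values are made up for illustration, and the names gpx_snippet and df_points are hypothetical.

```r
library(rvest)
library(tibble)

# Made-up GPX fragment with two track points (illustration only)
gpx_snippet <- '<gpx>
  <trk><trkseg>
    <trkpt lat="50.04494" lon="10.23479"><ele>238.1</ele></trkpt>
    <trkpt lat="50.04476" lon="10.23523"><ele>236.4</ele></trkpt>
  </trkseg></trk>
</gpx>'

# Same steps as in the helper: parse, then select all <trkpt> elements
track_points <- read_html(gpx_snippet) |>
  html_elements("trkpt")

# Attributes and element text are returned as character vectors
df_points <- tibble(
  lat = html_attr(track_points, "lat"),
  lon = html_attr(track_points, "lon"),
  elevation = html_text(track_points))

df_points
```

All three columns are still character at this point, which is why a conversion to numeric values is needed afterwards.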

Define the CSS selector and apply the above-mentioned function to all URLs, resulting in one final data frame:

css_track_point <- "trkpt"
df_stages <- map_df(gpx_url, function(x) stage(x, css_track_point), 
     .id = "stage_id")
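The row-binding step can also be written with map() and list_rbind(), which purrr 1.0.0 and later offer as the successor to map_df(). The sketch below shows the pattern on a stand-in function, so it runs without any download. fake_stage() and the placeholder file names are hypothetical and only serve the illustration.

```r
library(purrr)
library(tibble)

# Hypothetical stand-in for stage(): returns a small tibble, no download
fake_stage <- function(url) {
  tibble(source = url, lat = c("50.0", "50.1"))
}

urls <- c("stage_1.gpx", "stage_2.gpx")  # placeholder names, not real URLs

# One tibble per input, bound together; names_to records the list position
df_fake <- map(urls, fake_stage) |>
  list_rbind(names_to = "stage_id")

# Convert the position to character to mirror what `.id` in map_df() produces
df_fake$stage_id <- as.character(df_fake$stage_id)

df_fake
```

Either way, the identifier column only records the position of each URL in the input vector, so the order of gpx_url determines the stage numbers.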

Convert the relevant columns to numeric values:

df_stages_pro <- mutate(df_stages, across(c(lon, lat, elevation), parse_number))
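parse_number() is a forgiving choice here, because the elevation comes from the text of the whole track point element and may therefore carry surrounding whitespace or other characters. A small sketch with made-up input strings shows the behaviour:

```r
library(readr)

# Made-up raw strings: clean, padded with whitespace, and with a trailing unit
raw_values <- c("238.1", "\n      236.4\n    ", "233 m")

# parse_number() keeps the first number it finds in each string
parsed_values <- parse_number(raw_values)
parsed_values
```

The flip side is that only the first number in each string is kept. If a track point contained another numeric child element before the elevation, that number would be picked up instead, so it is worth checking the result against a few known values.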

Analysis

Turn the data frame into an sf (Pebesma 2018) object:

sf_stages <- st_as_sf(df_stages_pro, coords = c("lon", "lat"), 
     crs = st_crs(4326))
Simple feature collection with 40834 features and 2 fields
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 6.97561 ymin: 48.05465 xmax: 10.25127 ymax: 50.04494
Geodetic CRS:  WGS 84
# A tibble: 40,834 × 3
   stage_id elevation            geometry
 * <chr>        <dbl>         <POINT [°]>
 1 1             238. (10.23479 50.04494)
 2 1             236. (10.23523 50.04476)
 3 1             236. (10.23523 50.04476)
 4 1             233  (10.23582 50.04443)
 5 1             233  (10.23582 50.04443)
 6 1             233. (10.23637 50.04409)
 7 1             233. (10.23637 50.04409)
 8 1             227. (10.23682 50.04377)
 9 1             227. (10.23682 50.04377)
10 1             228. (10.23678 50.04381)
# ℹ 40,824 more rows

At the moment, the spatial data is represented as individual points. Combine the points of each stage into one ‘multipoint’, so that there is one row per stage:

sf_stages_multipoint <- summarise(sf_stages, geometry = st_combine(geometry), 
     .by = stage_id)
Simple feature collection with 4 features and 1 field
Geometry type: MULTIPOINT
Dimension:     XY
Bounding box:  xmin: 6.97561 ymin: 48.05465 xmax: 10.25127 ymax: 50.04494
Geodetic CRS:  WGS 84
# A tibble: 4 × 2
  stage_id                                                    geometry
  <chr>                                               <MULTIPOINT [°]>
1 1        ((10.23479 50.04494), (10.23523 50.04476), (10.23523 50.04…
2 2        ((9.21813 49.1415), (9.21785 49.14149), (9.21706 49.14149)…
3 3        ((9.79703 48.80134), (9.79703 48.80133), (9.7971 48.80137)…
4 4        ((7.96211 49.20398), (7.96207 49.2041), (7.96204 49.20418)…

Cast the multipoints into lines:

sf_stages_line <- st_cast(sf_stages_multipoint, "LINESTRING")
Simple feature collection with 4 features and 1 field
Geometry type: LINESTRING
Dimension:     XY
Bounding box:  xmin: 6.97561 ymin: 48.05465 xmax: 10.25127 ymax: 50.04494
Geodetic CRS:  WGS 84
# A tibble: 4 × 2
  stage_id                                                    geometry
  <chr>                                               <LINESTRING [°]>
1 1        (10.23479 50.04494, 10.23523 50.04476, 10.23523 50.04476, …
2 2        (9.21813 49.1415, 9.21785 49.14149, 9.21706 49.14149, 9.21…
3 3        (9.79703 48.80134, 9.79703 48.80133, 9.7971 48.80137, 9.79…
4 4        (7.96211 49.20398, 7.96207 49.2041, 7.96204 49.20418, 7.96…
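The same three steps (points, multipoints, lines) can be tried on a handful of made-up coordinates. This is a minimal sketch; toy_points and toy_lines are hypothetical names and the values are invented.

```r
library(sf)
library(dplyr)

# Five made-up points belonging to two stages
toy_points <- tibble::tibble(
  stage_id = c("1", "1", "1", "2", "2"),
  lon = c(10.0, 10.1, 10.2, 9.0, 9.1),
  lat = c(50.0, 50.1, 50.0, 49.0, 49.1))

toy_lines <- toy_points |>
  # Points in WGS 84 (EPSG:4326)
  st_as_sf(coords = c("lon", "lat"), crs = 4326) |>
  # One MULTIPOINT per stage; points are combined in row order
  summarise(geometry = st_combine(geometry), .by = stage_id) |>
  # Connect the points of each stage in that order
  st_cast("LINESTRING")

st_geometry_type(toy_lines)

# Length of each line, a quick plausibility check
st_length(toy_lines)
```

Because the points are connected in the order in which they appear, the rows have to stay in track order before they are combined. Applied to the real data, st_length() gives a rough check against the stage distances stated in the GPX file names.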

We can now plot the data using familiar ‘tidyverse’ (Wickham et al. 2019) techniques. To include an underlying map, ‘ggspatial’ (Dunnington 2023) is used.

vis_stages_line <- function(sf_stages_line) {
  ggplot() +
    annotation_map_tile(zoom = 8, type = "cartolight") +
    layer_spatial(sf_stages_line, aes(color = stage_id)) +
    theme(legend.position = "bottom") +
    labs(
      color = "Stage Number",
      title = "Deutschland Tour 2024",
      subtitle = "Color indicates Stage Number")
}
gg_stages_line <- vis_stages_line(sf_stages_line)

With ‘leaflet’ (Cheng et al. 2024) we can also have an interactive look:

vis_stages_line_interactive <- function(sf_stages_line) {
  sf_vis <- sf_stages_line |>
    mutate(stage_id = as_factor(stage_id))
  
  factpal <- colorFactor(topo.colors(nrow(sf_vis)), sf_vis$stage_id)
  
  leaflet(sf_vis) |>
    addPolylines(color = ~factpal(stage_id)) |>
    addTiles()
}
gg_stages_line_interactive <- vis_stages_line_interactive(sf_stages_line)

Conclusion

I think I found my perfect spot. By downloading the data, turning it into a spatial format and plotting it interactively, I could easily explore the route. I hope this helps you to find your spot as well! Hope to see you there :)

References

Cheng, Joe, Barret Schloerke, Bhaskar Karambelkar, and Yihui Xie. 2024. Leaflet: Create Interactive Web Maps with the JavaScript ’Leaflet’ Library. https://CRAN.R-project.org/package=leaflet.
Dunnington, Dewey. 2023. Ggspatial: Spatial Data Framework for Ggplot2. https://CRAN.R-project.org/package=ggspatial.
Landau, William Michael. 2021. “The Targets r Package: A Dynamic Make-Like Function-Oriented Pipeline Toolkit for Reproducibility and High-Performance Computing.” Journal of Open Source Software 6 (57): 2959. https://doi.org/10.21105/joss.02959.
Pebesma, Edzer. 2018. “Simple Features for R: Standardized Support for Spatial Vector Data.” The R Journal 10 (1): 439–46. https://doi.org/10.32614/RJ-2018-009.
Ushey, Kevin, and Hadley Wickham. 2024. Renv: Project Environments. https://CRAN.R-project.org/package=renv.
Wickham, Hadley. 2024. Httr2: Perform HTTP Requests and Process the Responses. https://CRAN.R-project.org/package=httr2.
Wickham, Hadley, Mara Averick, Jennifer Bryan, Winston Chang, Lucy D’Agostino McGowan, Romain François, Garrett Grolemund, et al. 2019. “Welcome to the tidyverse.” Journal of Open Source Software 4 (43): 1686. https://doi.org/10.21105/joss.01686.

Corrections

If you see mistakes or want to suggest changes, please create an issue on the source repository.

Reuse

Text and figures are licensed under Creative Commons Attribution CC BY 4.0. Source code is available at https://github.com/duju211/deutschland_tour, unless otherwise noted. The figures that have been reused from other sources don't fall under this license and can be recognized by a note in their caption: "Figure from ...".

Citation

For attribution, please cite this work as

During (2024, Sept. 24). Datannery: Deutschland Tour 2024. Retrieved from https://www.datannery.com/posts/deutschland-tour-2024/

BibTeX citation

@misc{during2024deutschland,
  author = {During, Julian},
  title = {Datannery: Deutschland Tour 2024},
  url = {https://www.datannery.com/posts/deutschland-tour-2024/},
  year = {2024}
}