tm_labels -> looking for algorithms #850

Open
mtennekes opened this issue Mar 25, 2024 · 6 comments

@mtennekes
Member

In tmap4, I started the implementation of 2 text layer functions:

  • tm_text Intended to print text to represent data directly, i.e. with visual variables.
  • tm_labels To label points, lines, and/or polygons.

The coordinates in tm_text are used as they are (by default). Example:

tm_shape(World, bbox = World) +
	tm_text("name", size="pop_est", col="continent", 
			col.scale = tm_scale_categorical(values = "seaborn.dark"),
			col.legend = tm_legend_hide(),
			size.scale = tm_scale_continuous(values.scale = 4),
			size.legend = tm_legend_hide())

image

For tm_labels the aim is not to print the text at the exact coordinates, but next to (or on top of) the geometries that they refer to.
So, this function requires some intelligent algorithms to place the text.

tmap3 already contained some 'intelligent' features in tm_text, namely the arguments auto.placement, remove.overlap, along.lines and overwrite.lines. I've already migrated all of them to tmap4 (except the last one, which was also not working well in the latest version of tmap3). But there is still a need for further extensions.

Points:

The automatic placement function for points is based on car::pointLabel (earlier part of maptools). This function was also used in tmap3, but I improved it a bit. What it does: it places the labels close to (but not on top of) the points such that overlap is minimised. Under the hood, simulated annealing and a genetic algorithm are used:

metroAfrica = sf::st_intersection(metro, World[World$continent == "Africa", ])
Africa = World[World$continent == "Africa", ]

tm_shape(land) +
	tm_raster("cover_cls", 
			  col.scale = tm_scale(values = cols4all::c4a("brewer.pastel1")[c(3,7,7,2,6,1,2,2)]),
			  col.legend = tm_legend_hide()) +
tm_shape(rivers) +
	tm_lines(lwd = "strokelwd", lwd.scale = tm_scale_asis(values.scale = .3), col = cols4all::c4a("brewer.pastel1")[2]) +
tm_shape(Africa, is.main = TRUE) + 
	tm_borders() +
tm_shape(metroAfrica) +
	tm_symbols(fill = "red", shape = "pop2020", size = "pop2020", 
			   size.scale = tm_scale_intervals(breaks = c(1, 2, 5, 10, 15, 20, 25) * 1e6, values.range = c(0.2,2)),
			   size.legend = tm_legend("Population in 2020"),
			   shape.scale = tm_scale_intervals(breaks = c(1, 2, 5, 10, 15, 20, 25) * 1e6, values = c(21, 23, 22, 21, 23, 22)),
			   shape.legend = tm_legend_combine("size")) +
	tm_labels("name")

image

The automatic removal of overlapping labels is also implemented, but could be improved. For instance, we should be able to specify some sort of weight, determining the importance of the labels.
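
One way such a weight could work is a greedy pass: visit the labels from most to least important and keep a label only if its bounding box does not overlap one that was already kept. The sketch below is base R only and is not how tmap implements remove.overlap; the function name `remove_overlap_weighted` and the `boxes` data frame layout (`xmin`, `ymin`, `xmax`, `ymax`) are assumptions made for illustration.

```r
# Hypothetical helper, not part of tmap: greedy overlap removal by weight.
# `boxes`  : data frame of label bounding boxes (xmin, ymin, xmax, ymax)
# `weight` : importance of each label (higher values win conflicts)
# Returns a logical vector, TRUE for the labels that are kept.
remove_overlap_weighted <- function(boxes, weight) {
  ord <- order(weight, decreasing = TRUE)
  keep <- logical(nrow(boxes))
  for (i in ord) {
    kept <- which(keep)
    # Axis-aligned rectangles overlap when they overlap in both x and y
    overlaps <- boxes$xmin[i] < boxes$xmax[kept] &
      boxes$xmax[i] > boxes$xmin[kept] &
      boxes$ymin[i] < boxes$ymax[kept] &
      boxes$ymax[i] > boxes$ymin[kept]
    if (!any(overlaps)) keep[i] <- TRUE
  }
  keep
}

boxes <- data.frame(
  xmin = c(0, 1.0, 5),
  ymin = c(0, 0.5, 0),
  xmax = c(2, 3.0, 7),
  ymax = c(1, 1.5, 1)
)
# Labels 1 and 2 overlap; label 2 has the larger weight, so label 1 is dropped.
keep <- remove_overlap_weighted(boxes, weight = c(1, 10, 5))
```

In a map, the weight could for example be a population column, so that large cities keep their labels when space is tight.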

What we also need: linking lines between labels and points, especially for those that are far away.
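
A minimal sketch of the "far away" rule, assuming the anchor points and the optimised label positions are available as two-column coordinate matrices: a linking line is produced only when a label has moved more than a threshold from its point. The name `leader_segments` and the `min_dist` argument are hypothetical, and the threshold is in map units.

```r
# Hypothetical helper, not part of tmap: linking lines for displaced labels.
# `anchor`, `label` : two-column matrices (x, y) with one row per feature
# `min_dist`        : labels closer than this to their point get no line
leader_segments <- function(anchor, label, min_dist) {
  d <- sqrt(rowSums((label - anchor)^2))
  far <- d > min_dist
  data.frame(
    id = which(far),
    x0 = anchor[far, 1], y0 = anchor[far, 2],
    x1 = label[far, 1],  y1 = label[far, 2],
    row.names = NULL
  )
}

anchor <- cbind(x = c(0, 0, 0), y = c(0, 0, 0))
label  <- cbind(x = c(0.1, 3, 0), y = c(0, 4, 2))
# Only the second and third labels are more than 1 unit from their points.
segs <- leader_segments(anchor, label, min_dist = 1)
```

The resulting rows can be drawn with `segments()` in base graphics or converted to line geometries.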

Lines

For labeling lines, the option along.lines has been migrated from tmap3. This calculates the angle of the line at the centroid. It works okayish, but needs refinement. Ideally, the labels should be right next to the lines rather than on top:

DE = World[World$name == "Germany",]
rivers_DE = sf::st_intersection(rivers, DE)

tm_shape(DE, crs = 3035) +
 	tm_polygons() +
tm_shape(rivers_DE) +
 	tm_lines(lwd = "strokelwd", lwd.scale = tm_scale_asis()) + 
 	tm_labels("name", bgcol = "grey85")

image
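
To make "next to the line rather than on top" concrete, here is a base R sketch, not the tmap implementation. It assumes a single line given as a two-column coordinate matrix, takes the point halfway along the line (a swap for the centroid mentioned above), reads the angle of the segment there, and shifts the label along the segment normal. The name `label_on_line` and the `offset` argument are hypothetical.

```r
# Hypothetical helper, not part of tmap.
# `xy`     : two-column matrix of line vertices, in drawing order
# `offset` : distance (map units) to shift the label off the line
label_on_line <- function(xy, offset = 0) {
  seg <- diff(xy)                       # per-segment (dx, dy)
  len <- sqrt(rowSums(seg^2))           # segment lengths
  half <- sum(len) / 2                  # distance to the halfway point
  cum <- cumsum(len)
  i <- which(cum >= half)[1]            # segment containing the halfway point
  t <- (half - (cum[i] - len[i])) / len[i]
  mid <- xy[i, ] + t * seg[i, ]
  ang <- atan2(seg[i, 2], seg[i, 1])
  # Keep the text upright: restrict the angle to (-90, 90] degrees
  if (ang > pi / 2) ang <- ang - pi
  if (ang <= -pi / 2) ang <- ang + pi
  normal <- c(-sin(ang), cos(ang))      # unit vector perpendicular to the text
  pos <- mid + offset * normal
  list(x = pos[1], y = pos[2], angle = ang * 180 / pi)
}

xy <- rbind(c(0, 0), c(4, 0), c(4, 2))
res <- label_on_line(xy, offset = 0.5)
```

A negative `offset` puts the label on the other side of the line. Very curvy lines would need smoothing first, since the angle of one short segment can be unrepresentative.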

Polygons

No implementation yet. In tmap3, the user could scale the text with "AREA", and use several scaling settings with root, print.tiny, and size.lowerbound. In tmap4, those belong imho to tm_text and the visual variable size. For tm_labels I am looking for a geometry-driven rather than data-driven procedure.

What I have in mind is a configurable procedure like this:

  1. Find a spot in the (multi)polygon where the text fits. Options could be: stay.at.centroid to prevent text labels from being drawn elsewhere, allow.rotation to allow labels to be rotated if that gives a better fit (think of Italy), and decrease.font.size to decrease the font size in case labels do not fit.
  2. Find spots for the labels of the unlabelled polygons. They should be placed outside any polygon.
  3. We need linking lines between those labels and points. A common application is a standard US state map:

image
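
Step 1 with stay.at.centroid and decrease.font.size could be sketched with sf as below. This is an illustration under assumptions, not an existing tmap function: the label is approximated by an unrotated rectangle in map units, `poly` is a single-feature sfc in a projected CRS, and the name `fit_label_at_centroid` and the `scales` argument are hypothetical.

```r
library(sf)

# Hypothetical helper, not part of tmap.
# Returns the largest factor in `scales` at which a width x height box,
# centred on the polygon centroid, is fully covered by the polygon,
# or NA if none fits (such a label would move on to step 2).
fit_label_at_centroid <- function(poly, width, height,
                                  scales = c(1, 0.8, 0.6, 0.4)) {
  ctr <- st_coordinates(st_centroid(poly))[1, ]
  for (s in scales) {
    box <- st_as_sfc(st_bbox(c(
      xmin = ctr[["X"]] - s * width / 2,
      ymin = ctr[["Y"]] - s * height / 2,
      xmax = ctr[["X"]] + s * width / 2,
      ymax = ctr[["Y"]] + s * height / 2
    ), crs = st_crs(poly)))
    if (st_covers(poly, box, sparse = FALSE)[1, 1]) return(s)
  }
  NA_real_
}

# A 10 x 10 square as a toy polygon
square <- st_sfc(st_polygon(list(
  rbind(c(0, 0), c(10, 0), c(10, 10), c(0, 10), c(0, 0))
)))
s_small <- fit_label_at_centroid(square, width = 4, height = 2)
s_wide  <- fit_label_at_centroid(square, width = 12, height = 2)
s_huge  <- fit_label_at_centroid(square, width = 100, height = 2)
```

For concave or multi-part polygons the centroid can fall outside the shape, in which case a point from `st_point_on_surface()` may be a better starting position.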

Options

Note: tm_text and tm_labels are the same layer function, but with different layer options. As you can see, I've placed them into opt_tm_<layer>, (see #848 option4).

The option names are not finalised, so if you have suggestions for better option names, let me know!

Tips and help welcome!

Do you know of any implemented algorithms that we can use? There is ggrepel (mentioned in #808), but so far, I wasn't able to extract the algorithms from the ggplot2 ecosystem. Help is more than welcome.

Also related to #279 and #373, and pinging @Nowosad @Robinlovelace @olivroy @agila5 @rogerbeecham @staropram

@Robinlovelace
Collaborator

This all looks good to me. I remember seeing a very good package for generating labels along curved lines but cannot recall what it's called. The {ggrepel} package also provides good non-overlapping labels. Could that be useful for tm_text()?

@tim-salabim

Maybe also useful for polygons
https://fosstodon.org/@atsyplenkov/112109165619723305

@sjewo
Collaborator

sjewo commented Mar 25, 2024

I used FField for automatic label placement (function FFieldPtRep), but the package has since been archived.

Screenshot 2024-03-25 at 15 12 42

@mtennekes
Member Author

Thanks for your input!

@Robinlovelace Yes, indeed, but see the last paragraph of my opening post.
@tim-salabim Lots of interesting projects mentioned there: https://github.com/atsyplenkov/centerline https://github.com/tylermorganwall/raybevel
@sjewo Awesome, those are the type of functions I am looking for. The source code is still available: https://github.com/cran/FField/blob/master/R/FField.R My understanding is that it essentially does the same thing as car::pointLabel, but I like how the labels are aligned. It doesn't take polygon areas into account, does it?

@sjewo
Collaborator

sjewo commented Mar 26, 2024

Hi @mtennekes !

I used a combination of automatic optimization of the y coordinate and manual placement in the x direction. I updated my code to use sf and changed it to take the borders into account.

library(tmap)
library(sf)
library(FField)

data("NLD_prov")

# rename shape
shp <- NLD_prov

# Centroids of polygon object
pts_label <- pts_anchor <- st_coordinates(st_centroid(shp))

# Identify polygon position relative to center
lrvec <- (pts_anchor[,"X"] >= mean(pts_anchor[, "X"])) + 1

# detect boundary of shp
shp_boundary <- st_coordinates(st_simplify(st_boundary(st_union(shp)), dTolerance = 2000))

# identify x value from border at height y (here one might also take the x position into account)
pts_label[,1] <- sapply(pts_label[,2], function(y) {
  row <- which.min(abs(shp_boundary[, "Y"] - y))
  shp_boundary[row, "X"] 
})

# move x coordinate away from the polygon boundary, to the left or right (lrvec)
pts_label[,1] <- pts_label[,1] + c(-1, 1)[lrvec] * 0.5 * sd(pts_label[,1])

# normalize coordinates
z_pts_label <- scale(pts_label)

# optimize position
pts_label_optimized <- FFieldPtRep(z_pts_label, rep.dist.lmt = 1, iter.max = 20000)

# inverse normalization only in y direction
#pts_label_optimized[,1] <- (pts_label_optimized[,1] * attr(z_pts_label, "scaled:scale")[1]) + attr(z_pts_label, "scaled:center")[1]
pts_label_optimized[,1] <- pts_label[,1]
pts_label_optimized[,2] <- (pts_label_optimized[,2] * attr(z_pts_label, "scaled:scale")[2]) + attr(z_pts_label, "scaled:center")[2]

# create sf-object
shp_label_optimized <- st_as_sf(data.frame(pts_label_optimized, st_drop_geometry(shp)), coords = c("x", "y"), crs = st_crs(shp))

# get position relative to mean for the coordinates
shp_label_optimized$just <- c(-1.5, 1.5)[(st_coordinates(shp_label_optimized)[,"X"] >= mean(st_coordinates(shp_label_optimized)[, "X"])) + 1]

# create connecting lines
m <- lapply(1:nrow(pts_anchor), function(i) {
  matrix(t(data.frame(pts_anchor, pts_label_optimized)[i,]), ncol = 2, byrow = TRUE)
})
edges <- sf::st_sfc(st_multilinestring(x = m), crs = st_crs(shp))

# plot
tm_shape(shp) + 
  tm_dots(size = 0.7) +
  tm_borders() +
  tm_shape(shp_label_optimized) + 
  tm_text("name", size = 0.8, xmod = "just") +
  tm_shape(edges) + 
  tm_lines() +
  tm_layout(frame = FALSE,
            inner.margins.extra = rep(0.3, 4))


lables

@ratnanil

I remember seeing a very good package for generating labels along curved lines but cannot recall what it's called.

I was reminded of isoband, but I'm not sure how good the algo is. Placement is pretty nice:

image
