Trying to understand sorting better #1046

ChristopherPAndrews · 2022-09-12T16:46:04Z

I am finding the documentation for sorting to be a little challenging to understand. Almost every example I have found uses sort options. I understand how this works, and it works fine for me.

However, this is not the technique used in the guide. That one uses an accessor function to generate a value not in the data. The docs say:

If the sort option is a function but does not take exactly one argument, it is assumed to be a comparator function; otherwise, the sort option is interpreted as a channel value definition and thus may be either a column name, accessor function, or array of values.

Curiously, this documentation doesn't say anything about the sort-options approach.

When I tried these other approaches, none of them worked. I could see the comparator and accessor functions getting called in the log when I added a console.log, but it didn't seem to change the sort order.

Then there is Plot.sort, which I can't figure out how to use at all.

Here is my collection of attempts: https://observablehq.com/@christopherpandrews/plot-sorting

I am really curious if I am missing something fundamental, the documentation is out of date or wrong, or there is a bug in the behavior.

(I'll note that the open issue about sorting on a value not used in the visualization further confused me. That was my original goal as well, and the accessor example seemed to suggest that it was possible, so it was odd that no one suggested it as an approach, making me further doubt the documentation.)

mbostock · 2022-09-13T02:35:06Z

Maintainer

I expect part of the confusion is the difference between sorting the (x) scale domain, i.e. the order of bars along the x-axis, and sorting the mark, i.e. the z-order in which marks are drawn. The former is known as mark sort options, whereas the latter is known as the basic sort transform. Both are controlled by the mark’s sort option; which one applies depends on the value of this option (as described in the README):

Note: when the value of the sort option is a string or a function, it is interpreted as a basic sort transform. To use both sort options and a sort transform, use Plot.sort.
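To make the rule in this note concrete, here is a minimal sketch in plain JavaScript of how a sort option value could be classified. The helper `describeSortOption` is hypothetical and is not part of Plot; it only approximates the README rules quoted in this thread (an object means mark sort options, i.e. scale domain sorting; a string or function means the basic sort transform). The field names `letter` and `frequency` are made up for illustration.

```javascript
// Hypothetical helper (NOT part of Plot): approximates the README rules
// for how the value of a mark's sort option is interpreted.
function describeSortOption(sort) {
  if (sort == null) return "no explicit sort";
  if (typeof sort === "function") {
    // Exactly one argument: treated as an accessor (channel value definition).
    // Any other argument count: treated as a comparator.
    return sort.length === 1
      ? "sort transform (accessor)"
      : "sort transform (comparator)";
  }
  if (typeof sort === "string") return "sort transform (column name)";
  if (Array.isArray(sort)) return "sort transform (array of values)";
  if (typeof sort === "object") return "mark sort options (scale domain)";
  return "unknown";
}

// Sorts the x scale domain by the y channel, descending: the order of bars.
const domainSort = {x: "letter", y: "frequency", sort: {x: "y", reverse: true}};

// Sorts the marks themselves by a column: the z-order in which they are drawn.
const markSort = {x: "letter", y: "frequency", sort: "frequency"};

console.log(describeSortOption(domainSort.sort)); // mark sort options (scale domain)
console.log(describeSortOption(markSort.sort)); // sort transform (column name)
console.log(describeSortOption((d) => d.frequency)); // sort transform (accessor)
console.log(describeSortOption((a, b) => a.frequency - b.frequency)); // sort transform (comparator)
```

An options object such as `domainSort` is what you would pass to a bar mark to reorder the bars, while `markSort` only changes the order in which the marks are drawn.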

The basic sort transform is generally only relevant if you have overlapping marks and want to avoid occlusion, such as dots of varying size in a scatterplot. For an example of how to use the sort transform, consider this scatterplot of world cities where the area of each dot is proportional to the city’s population. If the cities are not sorted, then bigger cities will tend to occlude smaller cities:

Plot.dot(cities100, {
  x: "lng",
  y: "lat",
  r: "population",
  fill: "#ccc",
  stroke: "#000",
  strokeWidth: 1,
  sort: null // disable default sort by descending radius
}).plot()

Whereas if cities are sorted by descending population, many more of the smaller cities will now be visible:

Plot.dot(cities100, {
  x: "lng",
  y: "lat",
  r: "population",
  fill: "#ccc",
  stroke: "#000",
  strokeWidth: 1,
  sort: "population", // explicitly sort by ascending population
  reverse: true // then reverse the order
}).plot()
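The effect of the options above on draw order can be sketched without Plot. This is only an illustration in plain JavaScript under the assumption that marks are drawn in array order, so later items paint on top of earlier ones; the three cities and their populations are made-up sample data, not the `cities100` dataset.

```javascript
// Made-up sample data for illustration only.
const sampleCities = [
  {name: "A", population: 500},
  {name: "B", population: 9000},
  {name: "C", population: 1200}
];

// Mirrors `sort: "population"` followed by `reverse: true`:
// ascending by population, then reversed, giving descending order.
const drawOrder = [...sampleCities]
  .sort((a, b) => a.population - b.population)
  .reverse();

// The same order expressed as a single two-argument comparator.
const viaComparator = [...sampleCities].sort(
  (a, b) => b.population - a.population
);

// The largest city is drawn first (at the bottom) and the smallest last (on top).
console.log(drawOrder.map((d) => d.name).join(", ")); // B, C, A
console.log(viaComparator.map((d) => d.name).join(", ")); // B, C, A
```

Because the smallest dots are drawn last, they sit on top of the larger ones instead of being hidden beneath them.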

The Plot.sort transform is simply different syntax for the basic sort transform shown above. For example, here is another way to write the above example:

Plot.dot(
  cities100,
  Plot.sort((a, b) => b.population - a.population, {
    x: "lng",
    y: "lat",
    r: "population",
    fill: "#ccc",
    stroke: "#000",
    strokeWidth: 1
  })
).plot()

Plot.sort is just a more explicit form of the sort option and is generally only needed if you’re chaining together multiple transforms, such as binning and stacking and then sorting.

Here is a notebook with live examples: https://observablehq.com/@observablehq/sort-dots-with-plot

1 reply

ChristopherPAndrews Sep 14, 2022
Author

Thank you. I was indeed missing the distinction between domain sorting and mark sorting. It would be great if this were included in the documentation.

I'll also note that it is unfortunate that these two related ideas are specified with the same sort option, and that the one that sorts the marks is not the one called "mark sort options".
