Mergesort vs Timsort on "real" data

Опубликовано: 14 Май 2026
на канале: Thomas Nyberg

16,147

175

This is a video compares Mergesort to Timsort on data which is already mostly linear. It was inspired by this video ( • Tim Sort ) and a comment on that video by Raymond Hettinger asking for data which better shows off the advantages of Timsort. My example of real world data is data in which the first 90% is already sorted and the next 10% is random. Whether this is real enough is questionable, but it does demonstrate how Timsort automatically takes advantage of runs of sorted data.

I did this by only making a tiny hacky change (see this commit: https://github.com/ApproximateIdentit... ) to Timo Bingmann's work. I then did a pretty horrible job recording it (be gentle with me...making videos is apparently not one of my skills...).

Also in case you're paying attention, yes the data that each sort starts with isn't quite the same. They are both generated by the same 90%-10% procedure I described above, but they are not the same. In case you're wondering, no this doesn't matter for this example.