1. It seems pretty clear that the author’s name alone is not sufficient for disambiguation. The library solution of adding the author’s date of birth is not a solution either, for at least two reasons: 1) the folks on Goodreads don’t always have access to the author’s birth date, and 2) the readers of the site don’t know the author’s birth date, so although it makes the two names different, they aren’t different in a way that is useful. Some sites show the author’s name with a book title, which helps if that is a title that the author is highly associated with. That’s as far as I can go with “answers” at this time.

  2. I understand the argument against using the library solution, and yet I also don’t get it, as it seems like the easiest option to verify externally…

  3. This issue is pretty common in music as well, with name collisions among artists, albums, and songs. The best handling I’ve seen is at MusicBrainz, which assigns IDs to all such things, so relationships in the DB are always to IDs, not names. They also handle the “multiple authors” and “author credited by different name than usual” cases, which most systems don’t deal with (for music or books). It’s a really good example of how to design a database schema for creative works, and they’re also working on a book database (BookBrainz) that I’d expect will be similarly high-quality.

  4. I call this the John Williams problem – if your music streaming service can’t tell the difference between the classical guitarist and the film score composer, it’s not much of a music service.

    However, John Williams (1932) and John Williams (1941) is no help to anyone.

    No… John Williams (classical guitarist) and John Williams (film score composer) is what you would want, and this case is no different: A quick glance at those two bios produces Joshua Slocum (sailor & adventurer) and Joshua Slocum (Funeral Consumers Alliance).

    End of confusion.
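
To make the two ideas in comments 3 and 4 concrete, here is a minimal sketch in Python. It is not MusicBrainz's or Goodreads' actual schema; the class names (`Author`, `Credit`, `Work`), the `disambiguation` field, the `credit_line` helper, and the example work title are all assumptions made up for illustration. The point is only that works link to author IDs rather than names, and that a short human-readable label, not a birth year, tells same-named people apart.

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional


def new_id() -> str:
    """Generate an opaque identifier (hypothetical helper)."""
    return str(uuid.uuid4())


@dataclass(frozen=True)
class Author:
    id: str                   # identity lives here, not in the name
    name: str
    disambiguation: str = ""  # short label a reader can actually use

    def label(self) -> str:
        # Show the label only when there is one to show.
        if self.disambiguation:
            return f"{self.name} ({self.disambiguation})"
        return self.name


@dataclass(frozen=True)
class Credit:
    author_id: str                     # relationship points at an ID
    credited_as: Optional[str] = None  # name as printed on this work, if different


@dataclass
class Work:
    id: str
    title: str
    credits: list[Credit] = field(default_factory=list)  # supports multiple authors


def credit_line(work: Work, authors: dict[str, Author]) -> str:
    """Render the names as credited on this particular work."""
    return ", ".join(
        c.credited_as or authors[c.author_id].name for c in work.credits
    )


# Two people with the same name get different IDs and different labels.
guitarist = Author(new_id(), "John Williams", "classical guitarist")
composer = Author(new_id(), "John Williams", "film score composer")
authors = {a.id: a for a in (guitarist, composer)}

# A made-up work credited to both, one of them under a variant name.
work = Work(
    new_id(),
    "Example Work (hypothetical)",
    [Credit(composer.id, credited_as="J. Williams"), Credit(guitarist.id)],
)

print(guitarist.label())
print(composer.label())
print(credit_line(work, authors))
```

Because every credit stores an ID, renaming an author or adding a second John Williams later never changes which person a work points to; only the displayed label changes.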

