I’ve got a database full of German names. Some have umlauts like in ‘Fürst’. I made a search form and I’m using SQL to find matches. But there’s a problem.
SELECT * FROM people
WHERE first_name LIKE '%fü%'
OR last_name LIKE '%fü%'
This query finds ‘Fürst’ but also ‘Furt’. That’s not what I want. How can I make it only find exact matches with the umlaut?
My database and tables use utf8_unicode_ci encoding. I can’t change that. I’ve already dealt with form input safety using htmlentities and html_entity_decode.
Any ideas on how to make this work right? I need to find only the names with umlauts when searching for them. Thanks for any help!
yo, have u tried using BINARY in ur query? it makes the search case-sensitive and might help with umlauts. like this:
SELECT * FROM people
WHERE BINARY first_name LIKE ‘%fü%’
OR BINARY last_name LIKE ‘%fü%’
just a thought. lemme know if it works for ya!
As someone who’s worked extensively with multilingual databases, I can share a solution that’s worked well for me. Instead of using LIKE, try using COLLATE with a case-sensitive, accent-sensitive collation for your comparison. Here’s an example:
SELECT * FROM people
WHERE first_name COLLATE utf8_bin = 'Fürst'
OR last_name COLLATE utf8_bin = 'Fürst'
This approach forces an exact match, including umlauts. It’s more precise than LIKE and respects the characters as they are. Remember, though, that this method requires exact input matching. If you need partial matching while still respecting umlauts, you might need to combine this with carefully constructed LIKE patterns or consider full-text search options specifically designed for German language nuances.
hey, have u considered using FULLTEXT indexing? it might help with those tricky umlaut searches. i’m curious, wat kind of German names r u dealing with? any cool historical ones? maybe we could brainstorm some creative SQL solutions together!