
Image source: http://publish.illinois.edu/

The creation of the modern search engine has been a huge benefit for the average person. The knowledge of the entire world is available to nearly anyone on any internet-capable device with just a few keystrokes or taps on a touchscreen. Yet despite the unparalleled power and accessibility that come from natural language search algorithms, the ability of the average person to find information has severely declined over the past two generations. Keyword searches, the crux of nearly all internet search activity, have grown overly simplified due to technological convenience. Anyone can search for a given keyword, but the ability to come up with related keywords by deriving meaning and context from other words is being lost.

Only a few years ago, I was working in QA, helping to develop machine learning algorithms for a search engine. One of my main tasks was developing thesauri: large banks of related terms that could be applied to a specific object or point of interest. The most difficult task I ever had was explaining to an engineer why “KFC” should be considered synonymous with the name “Kentucky Fried Chicken”. Far more people than you would think miss that connection.
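To make the thesaurus idea concrete, here is a minimal sketch in Python. It is not the system I worked on; the `THESAURUS` table and the `canonical_name` function are hypothetical names invented for illustration, and a real thesaurus would be far larger.

```python
# Hypothetical thesaurus: each canonical name maps to the variants
# that should be treated as synonyms for it.
THESAURUS = {
    "kentucky fried chicken": {"kfc", "kentucky fried chicken"},
}


def canonical_name(query: str) -> str:
    """Return the canonical name for a query, or the normalized query if unknown."""
    # Lowercase and collapse whitespace so "KFC " and "kfc" compare equal.
    normalized = " ".join(query.lower().split())
    for canonical, variants in THESAURUS.items():
        if normalized in variants:
            return canonical
    return normalized
```

With this table, `canonical_name("KFC")` and `canonical_name("Kentucky Fried Chicken")` both resolve to the same entry, which is exactly the connection the engineer needed convincing of.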

In the early years of the World Wide Web, when you needed to search for something, you’d hop onto a search engine like Alta Vista, Yahoo, Lycos, or Hotbot (ask your parents), and then type in a keyword to signify the proper name or subject of your search. The problem was that, in those earlier days of the internet, your results would be pretty much anything that had those particular words tagged in the metadata or in the body of the webpage itself. These search systems were limited: they could not use semantics to work out user intent and find relevant content.
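A rough sketch of that kind of literal matching, in Python, might look like the following. This is a toy under stated assumptions, not how any of those engines actually worked: the `PAGES` data, `tokens`, and `literal_search` are all made up for illustration.

```python
import re

# Hypothetical pages standing in for a tiny slice of the early web.
PAGES = {
    "armory": "The war hammer was a bludgeoning weapon carried by knights.",
    "hobby": "Warhammer is a fantasy miniature wargame with plastic figures.",
    "tools": "A hammer drives nails; a war is not required.",
}


def tokens(text):
    """Split text into a set of lowercase words, ignoring punctuation."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def literal_search(query, pages):
    """Return the names of pages whose text contains every query word."""
    terms = tokens(query)
    return sorted(name for name, body in pages.items() if terms <= tokens(body))
```

Here `literal_search("war hammer", PAGES)` returns both `"armory"` and `"tools"`, because each page happens to contain both words. Nothing in the matching knows which page is actually about the weapon.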

The problem then arises when I search for information on the “war hammer”, a bludgeoning weapon used by knights in the late Middle Ages, and get results for “Warhammer”, the fantasy miniature wargame published by Games Workshop. This also highlights another problem with content on the World Wide Web, especially in those days: most content is generated by enthusiasts and hobbyists, and search results more often reflect that more specialized material. Even now, with all the advances in search engine technology, this is still a problem from a user experience standpoint. Natural language search has advanced significantly, but it still has difficulty deriving user intent from semantics.

Don’t believe me? Hop on Google and type in the keyword “firefly”. Nearly every single result you get on the first few pages will be information on the long-dead Joss Whedon series, rather than anything about the bioluminescent beetle belonging to the order Coleoptera.

Now, what to do when you want the other stuff? You’ll need to start using modifiers and related terms. Back to the first example, let’s say you’re an art student and you’re studying medieval designs for a game, a film, or some other project. You want to find images and designs of war hammers without having to find images of space marines and other plastic miniature figures. Your search query would look like this:

“warhammer+weapon” or “warhammer weapon”

Maybe you’ll even want to look at a specific period where war hammers had a unique look or style. In that case, you might want to add some more terms to your search:

“warhammer+weapon+middle ages” or “warhammer weapon middle ages”
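If you ever build these queries in a script rather than typing them, the same idea can be sketched in Python. The function names are hypothetical, and the `q` parameter name is an assumption; check what your search engine of choice actually expects. `urlencode` comes from Python's standard library and turns the spaces into `+` signs for you.

```python
from urllib.parse import urlencode


def build_query(subject, *modifiers):
    """Join a subject keyword and its modifiers into one search string."""
    return " ".join([subject, *modifiers])


def query_string(subject, *modifiers):
    """Encode the search string as a URL query string (parameter name "q" is assumed)."""
    return urlencode({"q": build_query(subject, *modifiers)})
```

For example, `build_query("warhammer", "weapon", "middle ages")` gives the plain-text form of the query above, and `query_string("warhammer", "weapon")` gives the `+`-joined form as it would appear in a URL.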

Image source: jefmenguin.com

Following this basic logic allows for more precise and accurate search results. Recognizing that words can represent specific meanings and contexts gives you the ability to find results more in line with what you are looking for. Every so often, however, some keywords become so closely related to a specific word and context that you’ll end up getting counter-intuitive results.

For example, what would happen if you typed in “warhammer+knight” or “warhammer knight”? It should work to get the medieval weapon. It features a modifier to clarify the original subject query. Knights were the main wielders of war hammers in battle (often to stun and incapacitate other knights, and then capture them for ransom), so this should make sense. So what do you get when you do this in a search engine? Well…

…apparently the fantasy wargame also has a prominent unit called a “knight”. You can’t win them all, folks.

What’s the important takeaway from this? Searching for information requires both an awareness of the various contexts of words and an understanding of how those words relate to what should be completely unrelated subjects. A greater mastery of relational keywords will be a big help in conducting research, whether for academic purposes or out of curiosity.

…and if you really need help, you can always ask a librarian. I know, crazy thought!