The Need for Domain-Specific Search Engines

Jun 20, 2022

This weekend I was back home celebrating Father’s Day with my family. My father’s a physician, and he was casually scouring medical studies and literature on the relationship between exercise capacity and cardiovascular risk, as one does on a Saturday afternoon. Like many doctors, nurses, and pharmacologists, my father was researching articles on UpToDate.

UpToDate is an app that serves as one of the most trusted repositories of recent and relevant clinical information in the broader medical community. I spend most of my day building search software, so naturally I was curious about how he uses it and what he thinks of it.

After watching him use UpToDate, I’m convinced that this world needs domain-specific semantic search engines.

My father was trying to find a specific statistic: “percent improvement in survival rate for a 1 unit increase in METs?” (METs, or metabolic equivalents, measure exercise capacity as multiples of resting oxygen uptake in a sitting position.) He knew he had seen the statistic in an article before but couldn’t remember which article or which section contained it. UpToDate’s search returns links to entire articles, which can be long and dense.

It took him over 25 minutes to find the statistic he was looking for.

He had access to curated information he trusted, but he still couldn’t get value from it quickly because the search couldn’t return the actual statistic, only the documents that might contain it. I asked, “Why not just Google it?” He responded that while Google might return a statistic, he’d have no idea which articles Google’s search algorithm pulled it from.

Can Google automatically rank and curate the most relevant web pages and documents and have the same quality as thousands of volunteer community experts in a nuanced domain such as medicine? Can Google overcome not having access to non-public documents while also dealing with spam and clickbait?

Paul Graham @paulg

This may not just be a problem with Google but possibly also the recipe for beating Google. A startup usually has to start with a niche market. Why not try writing a search engine specifically for some category dominated by SEO spam?

Michael Seibel @mwseibel

A recent small medical issue has highlighted how much someone needs to disrupt Google Search. Google is no longer producing high quality search results in a significant number of important categories.

Maybe, but I don’t think those are the right questions to ask.

The question should be: how do we help people quickly get value from the knowledge repositories they already trust? These can be UpToDate, community forums, Discord chats, a collection of PDFs, books, etc., in any domain. High quality and trust are subjective and depend on the individual, company, or industry. But a universal truth is that, at the end of the day, people want answers from their knowledge quickly.

Luckily, the Google-like ability to ask questions of your knowledge and extract the answer or the most relevant paragraph from a large corpus of text is accessible outside of Google and available to augment nearly any text search application.
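To make "return the most relevant paragraph, not the whole article" concrete, here is a minimal sketch in plain Python. It is not how UpToDate or Google works, and it uses simple TF-IDF word overlap as a stand-in for real semantic search (which would typically use learned embeddings). The function names and the short `ARTICLE` text are placeholders I made up for illustration; only the sentence containing the statistic comes from this post.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_passages(article):
    """Treat each blank-line-separated paragraph as one searchable passage."""
    return [p.strip() for p in re.split(r"\n\s*\n", article) if p.strip()]


def rank_passages(question, passages):
    """Return (score, passage) pairs sorted by TF-IDF cosine similarity."""
    docs = [Counter(tokenize(p)) for p in passages]
    n = len(docs)
    # Document frequency: in how many passages does each term appear?
    df = Counter(term for doc in docs for term in doc)

    def idf(term):
        # Smoothed inverse document frequency; rare terms weigh more.
        return math.log((1 + n) / (1 + df[term])) + 1.0

    def vectorize(counts):
        return {term: count * idf(term) for term, count in counts.items()}

    def cosine(a, b):
        dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
        norm_a = math.sqrt(sum(w * w for w in a.values()))
        norm_b = math.sqrt(sum(w * w for w in b.values()))
        return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

    query = vectorize(Counter(tokenize(question)))
    scored = [(cosine(query, vectorize(doc)), p) for doc, p in zip(docs, passages)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Placeholder article text, NOT real UpToDate content.
ARTICLE = """Exercise testing is commonly performed on a treadmill or a stationary bicycle.

Metabolic equivalents (METs) express exercise capacity as multiples of resting oxygen uptake.

For each 1 MET increase in exercise capacity, there was a 12 percent improvement in survival."""

QUESTION = "percent improvement in survival rate for a 1 unit increase in METs?"

ranked = rank_passages(QUESTION, split_passages(ARTICLE))
print(ranked[0][1])  # the single best-matching passage
```

The point of the sketch is the unit of retrieval: the search indexes and returns paragraphs, so the reader lands on the sentence with the statistic instead of a link to a long article.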

Case in point: I fed some UpToDate articles and my father’s statistic question to GPT-3, and out came the answer: “For each 1 MET increase in exercise capacity, there was a 12 percent improvement in survival.”

His reaction could pretty much be summarized as: “Shut up and take my money!”
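For the curious, the GPT-3 experiment above can be approximated with a sketch like the following. This is an assumption about the general shape of such a setup, not the exact prompt I used: `build_prompt`, `answer`, and the `complete` callable are hypothetical names, and `complete` stands in for whichever text-completion model you have access to. Numbering the sources and asking the model to cite them is one way to address my father's objection to Google, since he wants to know which article an answer came from.

```python
def build_prompt(question, passages, max_chars=6000):
    """Assemble a question-answering prompt grounded in trusted passages.

    Passages are numbered so the model can cite them; passages that would
    push the context past `max_chars` are left out.
    """
    blocks = []
    used = 0
    for number, passage in enumerate(passages, start=1):
        block = f"[{number}] {passage.strip()}"
        if used + len(block) > max_chars:
            break
        blocks.append(block)
        used += len(block)
    context = "\n\n".join(blocks)
    return (
        "Answer the question using only the sources below. "
        "Cite the source number, and reply 'not found' if the answer is absent.\n\n"
        f"Sources:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )


def answer(question, passages, complete):
    """`complete` is any callable that maps a prompt string to a completion string."""
    return complete(build_prompt(question, passages)).strip()


# Stub model so the sketch runs without network access or API keys.
# A real setup would pass a function that calls a language model instead.
def stub_complete(prompt):
    return " (model output would appear here) "


passages = [
    "For each 1 MET increase in exercise capacity, "
    "there was a 12 percent improvement in survival."
]
question = "percent improvement in survival rate for a 1 unit increase in METs?"

prompt = build_prompt(question, passages)
print(prompt)
print(answer(question, passages, stub_complete))
```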

Jun 24, 2022

Wow, I definitely need to check out that podcast interview. Thanks for sharing!

Chris Wong

Jun 23, 2022

Interesting tidbit from the Tim Ferriss podcast

https://tim.blog/2022/06/16/jason-portnoy-transcript/

Palantir does exactly what you're talking about!

Jason Portnoy: Yeah. So at a high level, at the highest level, they help people who have really big, like biggest of the big, disparate data sources, disparate meaning they’ve got data in silos all over the place. They help them bring that data together into one cohesive place, so that they can extract insights out of that data. And the thing that we would talk about is, “It’s not necessarily the answers, it’s what questions can you ask of the data,” that really starts to define the value of that data. And so Palantir would pride itself on saying, “We allow you to ask more, and more interesting questions from your data.”

Anish's Newsletter