Why Traditional Patent Search Struggles - Video 2

In this second episode of the four-part Lighthouse IP AI series, Tim Lagemaat and Kacper Gorski explore how patent search works today and why it remains so challenging. They explain how professionals search across millions of documents using Boolean queries, classification codes, and multiple databases. While precise, this process is time-consuming, highly manual, and often takes days or weeks to complete. The discussion highlights the key limitations of current tools, from language barriers and inconsistent terminology to the fundamental issue of searching by keywords instead of meaning.

Read the transcript of this video below

Tim

“If patent data is publicly available, you might expect it to be easy to search.

But in reality, finding the right information is far more complex than most people think.

So how do professionals actually search for patents today, and where does it go wrong?”

Kacper

“A few things compound the problem.

First, patents are written in very specific legal and technical language.

The same invention can be described completely differently depending on the jurisdiction, the language, and the drafting style of the attorney.

Second is the scale.

You are searching across hundreds of millions of documents in dozens of languages.

A search for battery technology in English will not necessarily find relevant Chinese filings that describe the same concept using completely different terminology.

Third, classification systems like IPC and CPC codes can help, but they were designed decades ago and cannot fully capture the nuance of modern cross-disciplinary inventions.”

Tim

“How and where do people actually search for IP today?”

Kacper

“Most IP professionals use a combination of tools.

There are many commercial platforms available, alongside free databases like Espacenet, Google Patents, and individual patent office portals.

The standard approach is Boolean search.

You construct queries using keywords, classification codes, and operators like AND, OR, and NOT.

It is precise, but rigid.

In practice, experienced patent searchers spend days or even weeks building and refining query strings, testing them, reviewing results, and iterating.

It is highly manual and highly specialised work.”
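To make the "precise, but rigid" point concrete, here is a minimal sketch of Boolean matching over a few abstracts. The abstracts, the document IDs, and the helper names (`boolean_match`, `run_query`) are invented for illustration; real platforms have their own query syntax and far richer operators.

```python
import re


def tokenize(text):
    """Lowercase the text and return its word tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def boolean_match(text, all_of=(), any_of=(), none_of=()):
    """True if text contains every all_of term (AND), at least one
    any_of term (OR, if any are given), and no none_of term (NOT)."""
    tokens = tokenize(text)
    if not all(term in tokens for term in all_of):
        return False
    if any_of and not any(term in tokens for term in any_of):
        return False
    return not any(term in tokens for term in none_of)


# Hypothetical abstracts, made up for this example.
ABSTRACTS = {
    "doc1": "A lithium battery with a solid electrolyte layer.",
    "doc2": "A lithium battery using a liquid electrolyte and a graphite anode.",
    "doc3": "An electrochemical energy storage cell with a ceramic ion conductor.",
}


def run_query(abstracts):
    """Evaluate: battery AND lithium AND (solid OR ceramic) NOT liquid."""
    return sorted(
        doc_id
        for doc_id, text in abstracts.items()
        if boolean_match(
            text,
            all_of=("battery", "lithium"),
            any_of=("solid", "ceramic"),
            none_of=("liquid",),
        )
    )


print(run_query(ABSTRACTS))
```

Only doc1 is returned. doc3 describes a closely related idea but never uses the words "battery" or "lithium", so the query cannot see it. That is the rigidity a searcher works around by adding synonyms and re-running the query.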

Tim

“So what if I needed to find everything relevant to battery technology? What would I actually be doing right now?”

Kacper

“You would start by identifying the right classification codes for battery technologies.

Then you would build keyword strings around terms like lithium-ion, solid-state electrolyte, cathode materials, and so on.

You would run those searches across multiple databases because there is no single database that covers absolutely everything.

You would probably search in English first, and then wonder whether you are missing Japanese or Chinese filings.

Then you would manually review hundreds or thousands of results, reading abstracts, checking claims, and filtering out irrelevant hits.

A thorough search on a topic like that can realistically take weeks.”
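The first steps Kacper describes (pick classification codes, build keyword strings, run them in several databases, combine the hits) can be sketched roughly as follows. This is an illustration under stated assumptions: the `cpc=` field prefix and quoting rules are a made-up generic syntax, not that of any particular database, and the publication numbers in the usage lines are invented. H01M is the CPC subclass that covers batteries.

```python
def build_query(class_codes, keyword_groups):
    """Join classification codes with OR, synonyms within a group with OR,
    and the resulting groups with AND. The syntax is hypothetical."""

    def quote(term):
        # Quote multi-word or hyphenated terms so they stay together.
        return f'"{term}"' if " " in term or "-" in term else term

    parts = []
    if class_codes:
        parts.append("(" + " OR ".join(f"cpc={code}" for code in class_codes) + ")")
    for group in keyword_groups:
        parts.append("(" + " OR ".join(quote(term) for term in group) + ")")
    return " AND ".join(parts)


def merge_results(result_lists):
    """Merge hit lists from several databases, dropping duplicate
    publication numbers while keeping first-seen order."""
    seen = set()
    merged = []
    for hits in result_lists:
        for publication in hits:
            key = publication.replace(" ", "").upper()  # normalise formatting
            if key not in seen:
                seen.add(key)
                merged.append(key)
    return merged


query = build_query(
    ["H01M"],
    [["lithium-ion", "solid-state electrolyte"], ["cathode"]],
)
print(query)

# Invented publication numbers standing in for hits from two databases.
print(merge_results([["US 1234567 A", "EP1111111A1"], ["us1234567a", "CN222222222A"]]))
```

The sketch covers only the mechanical part. The manual review of hundreds or thousands of hits, and the question of whether Japanese or Chinese filings were missed, are exactly what it does not automate.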

Tim

“Okay, so what goes wrong with that process? And why don’t free patent tools solve it?”

Kacper

“The fundamental problem is that you are matching words rather than searching by meaning.

If a patent describes the same concept using different terminology, you are going to miss it.

Language is another major problem.

A Chinese patent describing a novel battery chemistry may use completely different terminology or may be poorly translated.

Even if it covers the same invention as a US equivalent, you might completely miss it if you are only searching in English.

Then there is the human bottleneck.

Even the best searcher can only review a limited number of documents.

At scale, things inevitably slip through.

Free tools and open-source databases are useful for quick lookups, and they have genuinely made patent data more accessible.

But they still have limitations for professional use: coverage gaps, limited search operators, limited analytics, and limited integration into workflows.

The bigger issue is that both free and commercial tools still largely rely on the same underlying principle: keyword matching.

They are searching text, not understanding concepts.

That is the structural limitation.”
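A small sketch can illustrate what "searching text, not understanding concepts" means in practice. It measures plain word overlap (Jaccard similarity) between short descriptions. The three descriptions and the stop-word list are invented for this example, and the assumption that the first two describe the same concept is ours, made only for illustration.

```python
import re

# Small, hypothetical stop-word list for this example only.
STOP_WORDS = {"a", "an", "the", "with", "of", "and", "comprising"}


def content_words(text):
    """Lowercase, split into words, and drop stop words."""
    return set(re.findall(r"[a-z0-9]+", text.lower())) - STOP_WORDS


def jaccard(text_a, text_b):
    """Shared words divided by all distinct words across both texts."""
    a, b = content_words(text_a), content_words(text_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


QUERY = "rechargeable lithium battery with solid electrolyte"
# Assumed to describe the same concept in different terminology.
SAME_CONCEPT = "secondary electrochemical cell comprising a ceramic ion conductor"
# Shares vocabulary with the query but describes a different design.
DIFFERENT_DESIGN = "lithium battery with liquid electrolyte"

print(jaccard(QUERY, SAME_CONCEPT))
print(jaccard(QUERY, DIFFERENT_DESIGN))
```

Under these assumptions, the description with no words in common scores 0.0, while the different design scores 0.5 because it happens to reuse the query's vocabulary. A word-matching system ranks by that overlap, whatever the two texts actually mean.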

Tim

“So even with commercial tools, you could still be missing things?”

Kacper

“Each professional platform has its own strengths and weaknesses.

That is why many commercial IP departments use multiple tools simultaneously.

But fundamentally, none of them fully solves the meaning problem.

If the concept exists but the words do not match your query, you may never find it.

That is the real gap we are trying to address.”