Clusters

AdminCP > Logs > Search Analytics > Clusters. Groups of near-duplicate queries that resolve to the same intent.

What a cluster is

A cluster is a set of normalised queries whose lexical signatures are similar enough to be treated as variants of the same question. "gpu overheating fix", "graphics card temperature high", and "gpu thermal throttle" sit in one cluster; you address the intent once rather than three times.

Clustering runs during the Daily rollup cron pass when Enable intent clustering is on. Stopwords are stripped using the lists selected in Clustering stopword languages before signatures are compared.
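
As a rough illustration of the grouping step, the sketch below assumes a signature is the set of non-stopword tokens in a query, and that two queries belong together when the Jaccard overlap of their signatures reaches a threshold. The real signature function, threshold, and stopword lists are not documented on this page, so `STOPWORDS`, `signature`, `cluster_queries`, and the 0.5 threshold are all hypothetical.

```python
import re

# Hypothetical stopword list; the add-on uses the lists selected in
# "Clustering stopword languages".
STOPWORDS = {"a", "an", "the", "to", "how", "my", "is", "for", "on"}


def signature(query: str) -> frozenset:
    """Lowercase, tokenise, and drop stopwords (assumed normalisation)."""
    tokens = re.findall(r"[a-z0-9]+", query.lower())
    return frozenset(t for t in tokens if t not in STOPWORDS)


def jaccard(a: frozenset, b: frozenset) -> float:
    """Fraction of tokens shared by two signatures."""
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_queries(queries, threshold=0.5):
    """Greedy single pass: join the first cluster whose seed is similar enough."""
    clusters = []  # list of (seed_signature, member_queries)
    for query in queries:
        sig = signature(query)
        for seed, members in clusters:
            if jaccard(seed, sig) >= threshold:
                members.append(query)
                break
        else:
            # No existing cluster matched, so this query starts a new one.
            clusters.append((sig, [query]))
    return [members for _, members in clusters]


groups = cluster_queries([
    "how to fix gpu overheating",
    "gpu overheating fix",
    "fix my gpu overheating",
    "reset password",
])
```

Here the three GPU phrasings reduce to the same token set and land in one group, while "reset password" starts its own. Plain token overlap is a simplification: on its own it would not group phrasings that share no words.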

List columns

| Column | Notes |
| --- | --- |
| Cluster | The canonical query, chosen as the most-searched member of the cluster. |
| Variants | Number of distinct normalised queries grouped into the cluster. |
| Searches | Total searches across all members of the cluster. |
| Searchers | Distinct members plus distinct guest IP hashes. |
| Last touched | Most recent rollup pass that updated the cluster. |

Clusters with fewer than Minimum cluster size distinct member queries are kept but marked low confidence; they are listed last in the default sort.
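
The column definitions can be read as a small aggregation over the searches in a cluster. The following is a minimal sketch under assumed names: `SearchEvent`, `summarise`, and the field names are hypothetical and do not reflect the add-on's actual schema.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SearchEvent:
    query: str                    # normalised query text
    member_id: Optional[int]      # None for guests
    guest_ip_hash: Optional[str]  # set only for guests


def summarise(events, min_cluster_size):
    """Derive the list columns for one cluster from its search events."""
    counts = {}
    for e in events:
        counts[e.query] = counts.get(e.query, 0) + 1
    members = {e.member_id for e in events if e.member_id is not None}
    guests = {e.guest_ip_hash for e in events
              if e.member_id is None and e.guest_ip_hash}
    return {
        "cluster": max(counts, key=counts.get),   # most-searched member query
        "variants": len(counts),                  # distinct normalised queries
        "searches": len(events),                  # total searches
        "searchers": len(members) + len(guests),  # members + guest IP hashes
        "low_confidence": len(counts) < min_cluster_size,
    }


events = [
    SearchEvent("gpu overheating fix", 1, None),
    SearchEvent("gpu overheating fix", 2, None),
    SearchEvent("gpu thermal throttle", 1, None),
    SearchEvent("graphics card temperature high", None, "a1b2"),
    SearchEvent("gpu overheating fix", None, "a1b2"),
]
row = summarise(events, min_cluster_size=3)
```

With these five events the canonical query is "gpu overheating fix", there are three variants, and two members plus one guest hash give three searchers. Lowering the variant count below `min_cluster_size` would set the low-confidence flag rather than drop the cluster.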

Cluster detail

The detail view lists every member query with its search count, plus a combined per-day volume chart. Use it to confirm the cluster is coherent before acting on it.

Search-engine diagnostics

When 80 percent or more of a cluster's searches return no results, the detail view includes a Search engine diagnostics panel that explains why. The Diagnostics sweep cron entry populates the panel and refreshes it about once a day.
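
The 80 percent condition amounts to a simple ratio check. This sketch uses hypothetical names (`shows_diagnostics`, `ZERO_RESULT_PERCENT`) and assumes the ratio is taken over all searches in the cluster; integer arithmetic avoids floating-point edge cases at exactly 80 percent.

```python
ZERO_RESULT_PERCENT = 80  # "80 percent or more", as stated above


def shows_diagnostics(total_searches: int, zero_result_searches: int) -> bool:
    """True when enough of the cluster's searches returned nothing."""
    if total_searches <= 0:
        return False
    # Compare as integers: zero / total >= 80 / 100.
    return zero_result_searches * 100 >= total_searches * ZERO_RESULT_PERCENT
```

For example, a cluster with 4 empty results out of 5 searches qualifies, while 79 out of 100 does not.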

| Reading | Meaning |
| --- | --- |
| Suggested tags | Tags worth attaching to relevant content so the query starts resolving. |
| Typo | A spelling correction. Members are searching for a word that does not exist on the board; the suggested term does. |
| Content exists | Matching content is already on the board but ranks too low to surface. A relevance problem, not a content gap. A link to the closest match is shown. |
| Genuine gap | No matching content exists. This is real unmet demand: write the content, or point members to a resource that covers it. |
| Engine | Enhanced Search when the verdict came from the Elasticsearch index, Native when it came from the fallback. |

The typo and content-versus-gap verdicts require Enhanced Search (XFES). Without it the panel shows tag suggestions only, drawn from the most-used site tags.
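
One way to picture how the readings relate is the sketch below. It is an assumption-laden illustration, not the add-on's logic: `classify` and its parameters are hypothetical, and the order in which the verdicts are checked (typo first, then content exists, then genuine gap) is a guess.

```python
from typing import Optional


def classify(has_enhanced_search: bool,
             spelling_suggestion: Optional[str],
             closest_match_url: Optional[str]) -> dict:
    """Pick a verdict for a zero-result cluster (assumed decision order)."""
    if not has_enhanced_search:
        # Without Enhanced Search only tag suggestions are shown,
        # so there is no typo or content-versus-gap verdict.
        return {"engine": "Native", "verdict": None}
    if spelling_suggestion:
        verdict = "Typo"
    elif closest_match_url:
        verdict = "Content exists"
    else:
        verdict = "Genuine gap"
    return {"engine": "Enhanced Search", "verdict": verdict}
```

Under these assumptions, a cluster with no spelling suggestion and no close match on an Enhanced Search board would read as a genuine gap, while the same cluster on a board without Enhanced Search would carry no verdict at all.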

When to merge or split

Clustering is heuristic. If two queries that mean different things end up together, the noise will usually wash out as more variants accumulate. If a miscluster persists, add the offending phrasing to the denylist, or use a separate content type's capture configuration to keep the two signals apart.