Privacy

What the add-on captures, where it stores it, how long it keeps it, and which options change the posture.

What is captured

When a search reaches the capture listener and all gates pass, the following may be recorded in xf_mc_sa_search_log:

| Field | Always | Conditional | Notes |
|---|---|---|---|
| Query text (raw and normalised) | yes | | Truncated to 255 UTF-8 characters. |
| Search type | yes | | The content type the search was scoped to, or _global for unscoped searches. |
| Result count | yes | | Integer; a derived is_zero_result flag is also stored. |
| Timestamp | yes | | Unix epoch, second resolution. |
| User ID | | when Attribute searches to users is on | Otherwise stored as NULL. |
| IP hash | | when Hash guest IPs is on | SHA-256 with a per-install secret. The raw IP is never persisted. |
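
The capture rules in the table can be sketched as a small row builder. This is an illustration only, not the add-on's code: the column names, the normalisation rule (lowercase, collapsed whitespace), and the use of HMAC-SHA-256 as the "SHA-256 with a per-install secret" construction are all assumptions.

```python
import hashlib
import hmac
import time

MAX_QUERY_CHARS = 255  # truncation limit from the table above


def normalise_query(raw: str) -> str:
    # Assumed normalisation: lowercase and collapse runs of whitespace.
    return " ".join(raw.lower().split())


def hash_ip(ip: str, secret: bytes) -> str:
    # Assumed construction: HMAC-SHA-256 keyed with the per-install secret.
    return hmac.new(secret, ip.encode("utf-8"), hashlib.sha256).hexdigest()


def build_row(raw_query, search_type, result_count, user_id, ip, *,
              attribute_users, hash_guest_ips, secret):
    """Build one raw-log row; field names are hypothetical."""
    return {
        "query_raw": raw_query[:MAX_QUERY_CHARS],
        "query_normalised": normalise_query(raw_query)[:MAX_QUERY_CHARS],
        "search_type": search_type or "_global",
        "result_count": result_count,
        "is_zero_result": int(result_count == 0),
        "created_at": int(time.time()),  # Unix epoch, second resolution
        # Conditional fields: stored only when the matching option is on.
        "user_id": user_id if attribute_users else None,
        "ip_hash": hash_ip(ip, secret) if hash_guest_ips else None,
    }
```

With both options off, the row carries no user ID and no IP-derived value. In every case the raw IP is passed only to the hash function and never appears in the row.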

What is not captured

  • Result snippets, clicked results, referrer, user-agent, browsing path.
  • Conversation contents and direct-message searches: excluded at the listener level. These cannot be captured even if the search type is ticked.

Where it is stored

Five tables, all in the site database, prefixed xf_mc_sa_*:

| Table | Purpose | Identifiable? |
|---|---|---|
| xf_mc_sa_search_log | Raw rows, one per captured search. | Optionally, depending on the posture options above. |
| xf_mc_sa_search_aggregate | Daily rollups grouped by query, type, and day. | No, pseudonymised counts only. |
| xf_mc_sa_search_cluster | Intent clusters. | No. |
| xf_mc_sa_search_trending_snapshot | Trending detections per period. | No. |
| xf_mc_sa_search_denylist | Admin-curated exclusions. | No. |

Once a raw row has been rolled up and pruned, the only remaining trace is an anonymised count grouped by (query_hash, search_type, aggregate_date).
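
A minimal sketch of that rollup, to show why no user field survives it. The row shape, the UTC day boundary, and the derivation of query_hash as a SHA-256 of the normalised query are assumptions for illustration; the add-on's real rollup may differ.

```python
import hashlib
from collections import Counter
from datetime import datetime, timezone


def rollup(rows):
    """Count raw rows per (query_hash, search_type, aggregate_date)."""
    counts = Counter()
    for row in rows:
        # Assumed: the day is taken in UTC from the epoch timestamp.
        day = datetime.fromtimestamp(row["created_at"], tz=timezone.utc).date()
        # Assumed: query_hash is a plain SHA-256 of the normalised query.
        query_hash = hashlib.sha256(
            row["query_normalised"].encode("utf-8")
        ).hexdigest()
        # user_id and ip_hash are deliberately not part of the key.
        counts[(query_hash, row["search_type"], day.isoformat())] += 1
    return counts
```

Two members searching the same text in the same scope on the same day collapse into a single key with a count of 2, so the aggregate cannot say who searched.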

How long it is kept

Two independent retention windows:

| Window | Option | Default |
|---|---|---|
| Raw rows | Raw log retention | 30 days |
| Aggregate rows | Aggregate retention | 730 days |

The Retention prune cron entry applies both windows daily, deleting in chunks sized by Retention prune chunk size. Removing raw rows does not affect the aggregates they fed: aggregate rows are pseudonymous and persist until their own retention window expires.
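
Chunked pruning of the raw window can be sketched as below. It uses Python's sqlite3 module so that it runs standalone; the log_id and created_at column names are hypothetical, and the add-on's own prune job is not shown in this document.

```python
import sqlite3
import time

DAY = 86400  # seconds


def prune_raw(conn, retention_days, chunk_size, now=None):
    """Delete raw rows older than the window, chunk_size rows per statement."""
    now = int(time.time()) if now is None else now
    cutoff = now - retention_days * DAY
    total = 0
    while True:
        # Each pass removes at most chunk_size rows, keeping transactions short.
        cur = conn.execute(
            "DELETE FROM xf_mc_sa_search_log WHERE log_id IN ("
            " SELECT log_id FROM xf_mc_sa_search_log"
            " WHERE created_at < ? LIMIT ?)",
            (cutoff, chunk_size),
        )
        conn.commit()
        if cur.rowcount == 0:
            break
        total += cur.rowcount
    return total
```

A smaller chunk size means more statements but shorter locks per statement, which is the usual reason for exposing it as an option.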

What members can do

Members choose whether their queries feed public widgets at Account preferences > Search Analytics widgets. The initial setting is controlled by the Members opt in to public widgets by default option. Opting out excludes that member's queries from public widget aggregates only; admin analytics still include them under the posture rules above.

What admins can do

  • Set Enable capturing to Off to stop all capture immediately.
  • Lower Raw log retention to shorten the identifiable window.
  • Leave Hash guest IPs on unless raw-IP correlation with other logs is required.
  • Keep Attribute searches to users off for the strongest anonymisation posture.
  • Add queries or patterns to the denylist to exclude specific text from capture.
  • Add staff or test accounts to Excluded user groups.

Removing a specific user's data

The raw log is the only identifiable table. To erase one user's raw rows:

DELETE FROM xf_mc_sa_search_log WHERE user_id = ?;

The aggregate, cluster, and trending tables have no user_id column, so subject-access or erasure requests do not need to touch them. An individual cannot be identified from an aggregate row.
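
When the deletion is scripted, bind the user ID as a parameter instead of concatenating it into the statement. The sketch below uses sqlite3 so that it runs standalone; the helper name is hypothetical, and MySQL drivers such as PyMySQL use %s in place of ? as the placeholder.

```python
import sqlite3


def erase_user_raw_rows(conn, user_id):
    """Delete one user's raw search rows and return how many were removed."""
    cur = conn.execute(
        "DELETE FROM xf_mc_sa_search_log WHERE user_id = ?",
        (user_id,),  # bound parameter, never string-formatted into the SQL
    )
    conn.commit()
    return cur.rowcount
```

The returned count can be recorded as evidence that the erasure request was carried out.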