Privacy
What the add-on captures, where it stores it, how long it keeps it, and which options change the posture.
What is captured
When a search reaches the capture listener and all gates pass, the following may be recorded in xf_mc_sa_search_log:
| Field | Always | Conditional | Notes |
|---|---|---|---|
| Query text (raw and normalised) | yes | | Truncated to 255 UTF-8 characters. |
| Search type | yes | | The content type the search was scoped to, or _global for unscoped searches. |
| Result count | yes | | Integer; a derived is_zero_result flag is also stored. |
| Timestamp | yes | | Unix epoch, second resolution. |
| User ID | | when Attribute searches to users is on | Otherwise stored as NULL. |
| IP hash | | when Hash guest IPs is on | SHA-256 with a per-install secret. The raw IP is never persisted. |
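
The capture rules in the table can be sketched in a few lines of Python. This is an illustration only, not the add-on's code: the function names, the dictionary keys other than is_zero_result, the whitespace-and-lowercase normalisation, and the use of HMAC as the way the per-install secret is combined with SHA-256 are all assumptions. Whether hashing applies to guests only is not stated here, so the sketch hashes whenever the option is on.

```python
import hashlib
import hmac
from typing import Optional

MAX_QUERY_CHARS = 255  # truncation limit from the table above


def hash_ip(ip: str, install_secret: bytes) -> str:
    """Keyed SHA-256 of an IP address (HMAC is an assumption of this sketch)."""
    return hmac.new(install_secret, ip.encode("utf-8"), hashlib.sha256).hexdigest()


def build_log_row(
    query: str,
    search_type: Optional[str],
    result_count: int,
    timestamp: int,
    user_id: Optional[int],
    ip: Optional[str],
    *,
    attribute_users: bool,   # stands in for "Attribute searches to users"
    hash_guest_ips: bool,    # stands in for "Hash guest IPs"
    install_secret: bytes,
) -> dict:
    """Build one hypothetical raw-log row; the raw IP never enters the row."""
    raw = query[:MAX_QUERY_CHARS]  # Python slices by character, not by byte
    return {
        "query_raw": raw,
        "query_normalised": " ".join(raw.lower().split()),  # hypothetical normalisation
        "search_type": search_type or "_global",
        "result_count": result_count,
        "is_zero_result": result_count == 0,
        "searched_at": timestamp,  # Unix epoch, seconds
        "user_id": user_id if attribute_users else None,
        "ip_hash": hash_ip(ip, install_secret) if (hash_guest_ips and ip) else None,
    }
```

With both posture options off, the resulting row carries neither a user ID nor an IP hash, which is the strongest posture the table describes.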
What is not captured
- Result snippets, clicked results, referrer, user-agent, browsing path.
- Conversation contents and direct-message searches: excluded at the listener level. These cannot be captured even if the search type is ticked.
Where it is stored
Five tables, all in the site database, prefixed xf_mc_sa_*:
| Table | Purpose | Identifiable? |
|---|---|---|
| xf_mc_sa_search_log | Raw rows, one per captured search. | Optionally, depending on the posture options above. |
| xf_mc_sa_search_aggregate | Daily rollups grouped by query, type, and day. | No, pseudonymised counts only. |
| xf_mc_sa_search_cluster | Intent clusters. | No. |
| xf_mc_sa_search_trending_snapshot | Trending detections per period. | No. |
| xf_mc_sa_search_denylist | Admin-curated exclusions. | No. |
Once a raw row has been rolled up and pruned, the only remaining trace is a pseudonymised count grouped by (query_hash, search_type, aggregate_date).
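
To make the grouping concrete, here is a minimal sketch of a daily rollup keyed by (query_hash, search_type, aggregate_date). It is an illustration under assumptions: the raw-row keys query_normalised and searched_at, the plain SHA-256 used for query_hash, and the use of UTC days are hypothetical, not taken from the add-on.

```python
import hashlib
from collections import Counter
from datetime import datetime, timezone


def rollup(raw_rows):
    """Count searches per (query_hash, search_type, aggregate_date)."""
    counts = Counter()
    for row in raw_rows:
        # Hypothetical query_hash: SHA-256 of the normalised query text.
        query_hash = hashlib.sha256(
            row["query_normalised"].encode("utf-8")
        ).hexdigest()
        # Assumed UTC day boundary for aggregate_date.
        day = datetime.fromtimestamp(
            row["searched_at"], tz=timezone.utc
        ).date().isoformat()
        # user_id and ip_hash are never read, so they cannot reach the aggregate.
        counts[(query_hash, row["search_type"], day)] += 1
    return counts
```

Two members searching for the same text on the same day collapse into a single key with a count of 2; nothing in the key or the value distinguishes them.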
How long it is kept
Two independent retention windows:
| Window | Option | Default |
|---|---|---|
| Raw rows | Raw log retention | 30 days |
| Aggregate rows | Aggregate retention | 730 days |
The Retention prune cron entry applies both windows daily, deleting in chunks sized by Retention prune chunk size. When raw rows are removed, the aggregates they fed are unaffected: aggregate rows are pseudonymous and survive until their own window expires.
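
A chunked prune of this kind might look like the following sketch, shown against an in-memory SQLite database for illustration. The column name searched_at and the function signature are assumptions; the add-on's actual cron job and schema may differ.

```python
import sqlite3

SECONDS_PER_DAY = 86400


def prune_older_than(conn, table, ts_column, retention_days, now, chunk_size):
    """Delete rows older than the retention window, chunk_size rows at a time.

    table and ts_column are trusted constants in this sketch; they are
    interpolated into the SQL and must never come from user input.
    """
    cutoff = now - retention_days * SECONDS_PER_DAY
    removed = 0
    while True:
        cur = conn.execute(
            f"DELETE FROM {table} WHERE rowid IN ("
            f"SELECT rowid FROM {table} WHERE {ts_column} < ? LIMIT ?)",
            (cutoff, chunk_size),
        )
        conn.commit()  # commit per chunk so each transaction stays short
        removed += cur.rowcount
        if cur.rowcount < chunk_size:
            return removed
```

Calling it once per table with that table's own retention value mirrors the two independent windows: for example, 30 days for the raw log and 730 days for the aggregate table.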
What members can do
The opt-in to public widgets lives at Account preferences > Search Analytics widgets. The default is controlled by Members opt in to public widgets by default. Opting out excludes that member's queries from public widget aggregates only; admin analytics still include them under the posture rules above.
What admins can do
- Set Enable capturing to Off to stop all capture immediately.
- Lower Raw log retention to shorten the identifiable window.
- Leave Hash guest IPs on unless raw-IP correlation with other logs is required.
- Keep Attribute searches to users off for the strongest anonymisation posture.
- Add queries or patterns to the denylist to exclude specific text from capture.
- Add staff or test accounts to Excluded user groups.
Removing a specific user's data
The raw log is the only table that can hold identifiable data. To erase one user's raw rows, bind the user's ID to the placeholder in:
DELETE FROM xf_mc_sa_search_log WHERE user_id = ?;
The aggregate, cluster, and trending tables have no user_id column, so subject-access or erasure requests do not need to touch them. An individual cannot be identified from an aggregate row.
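
As a usage illustration, the DELETE statement can be run with a bound parameter rather than by pasting the ID into the SQL. The sketch below uses Python's sqlite3 module against a stand-in database; the helper name erase_user_searches is hypothetical, and a real site would run the statement through its own database client.

```python
import sqlite3


def erase_user_searches(conn, user_id):
    """Delete one user's raw search rows and return how many were removed."""
    cur = conn.execute(
        "DELETE FROM xf_mc_sa_search_log WHERE user_id = ?",
        (user_id,),  # bound parameter, never string-formatted into the SQL
    )
    conn.commit()
    return cur.rowcount
```

Rows captured while Attribute searches to users was off have a NULL user_id, so they do not match the WHERE clause and are left in place.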