Sorry, we don't support your browser.  Install a modern browser

Add support for mapping custom props to existing unused columns e.g. subdivision1_code, subdivision2_code

Problem Statement

Currently, when sending analytics events via the Plausible frontend API (via plausible tracker or analytics-plugin-plausible), all custom properties are stored in the meta.key / meta.value arrays in ClickHouse’s events_v2 table.

However, many systems (like ours) require frequent filtering by a few high-traffic properties (e.g. companyId, feature, etc). Using arrayExists(...) on meta.key / meta.value is not performant at scale.

Plausible’s events_v2 schema already includes columns like:

  • subdivision1_code
  • subdivision2_code
  • click_id_param
  • scroll_depth

These are largely unused in the default ingestion pipeline. If users could designate certain prop keys (e.g. companyId, feature) to be automatically mapped into those columns, queries would be far more efficient, and this wouldn’t require large schema changes on the user side.

Proposal

  • Introduce a configuration (e.g. in config.toml or runtime env) where users can specify prop-to-column mappings:

    [analytics.custom_column_mapping]
    subdivision1_code = "companyId"
    subdivision2_code = "feature"
    click_id_param = "id"
    scroll_depth = "label"
    `
  • On ingestion (in Plausible backend), inspect payload props; if they contain the configured keys, map them into the designated column fields rather than placing them in meta.

  • Fallback: if props are missing or empty, default to NULL / empty as usual, and still place any remaining props in meta.
  • Maintain backward compatibility: existing usage of meta properties continues to work.

Benefits

  • Dramatically faster filtering, group-by, and querying on those common dimensions.
  • Avoids the overhead and complexity of downstream ETL or materialized views for many users.
  • Leverages existing schema columns in events_v2 that are currently underused.
  • Makes Plausible more extensible for product/behavioral analytics use cases, not just simple pageviews.

Considerations / Edge Cases

  • Type conversion / mismatch (e.g. mapping values String → numeric column)
  • What happens if conflicting props mapped to same column
  • Documentation, migration, versioning
  • Ensuring privacy / PII — users must map only non-sensitive identifiers

Example

A payload:

{
  "name": "ItemOpened",
  "url": "...",
  "props": {
    "companyId": "123",
    "feature": "PRODUCT",
    "label": "P1",
    "id": "456",
    "foo": "bar"
  }
}

With mapping:

  • subdivision1_code = “123”
  • subdivision2_code = “PRODUCT”
  • scroll_depth = “P1”
  • click_id_param = “456”
  • meta remains with { "foo": "bar" }

Then filters like WHERE subdivision1_code = 123 AND subdivision2_code = 'PRODUCT' become efficient column filters.

I hope you’ll consider this as a feature enhancement — it bridges Plausible’s flexibility with performance needed in higher-scale event analytics.

Thanks!

8 months ago