Redesign analytics as internal plugin-based architecture #30

Open
opened 2026-04-30 09:46:18 +02:00 by McPringle · 0 comments
McPringle commented 2026-04-30 09:46:18 +02:00 (Migrated from github.com)

Context

We have made several changes to ActivitySummaryService on a separate branch to fix issues in the current analytics implementation. Those changes solved specific problems, but the resulting design became too complex.

Instead of continuing to patch the existing approach, we want to redesign analytics from the ground up with a simpler and more maintainable architecture that incorporates the lessons learned.

Goal

Replace the current analytics implementation with a new internal plugin-based architecture.

This does not mean loading external code or JARs. "Plugin" here refers only to the internal project structure: analytics features should be implemented as isolated, clearly defined modules within the codebase.

The redesign should cover at least:

  • Achievements
  • Personal Records
  • Current Week / Month / Year
  • Additional analytics and evaluation features currently handled by the existing system

Proposed architecture

Core plugin contract

Use a single base interface for all analytics plugins:

  • AnalyticsPlugin
      • AnalyticsPluginType getPluginType()
      • String getPluginKey()

Each plugin must have:

  • exactly one pluginType
  • exactly one pluginKey
  • a single responsibility

pluginType is defined centrally by FitPub through an enum.
pluginKey is defined by the concrete plugin implementation.

Suggested enum values:

  • ACHIEVEMENT
  • PERSONAL_INFO
  • PERSONAL_RECORD
  • SUMMARY
  • TRAINING
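
As a rough illustration, the contract and enum described above could look like the following sketch. The interface name, method signatures, and enum values come from this issue; the example class CurrentWeekSummaryPlugin and its key "current-week" are hypothetical and only show how a concrete plugin would identify itself.

```java
// Plugin categories, defined centrally (values as suggested in this issue).
enum AnalyticsPluginType {
    ACHIEVEMENT, PERSONAL_INFO, PERSONAL_RECORD, SUMMARY, TRAINING
}

// The single base contract shared by all analytics plugins.
interface AnalyticsPlugin {
    AnalyticsPluginType getPluginType();
    String getPluginKey();
}

// Hypothetical example plugin: the class name and key are assumptions.
final class CurrentWeekSummaryPlugin implements AnalyticsPlugin {
    @Override
    public AnalyticsPluginType getPluginType() {
        return AnalyticsPluginType.SUMMARY;
    }

    @Override
    public String getPluginKey() {
        return "current-week";
    }
}
```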

Why a single interface

We initially discussed a hierarchy such as AchievementsPlugin, SummaryPlugin, etc., but the current direction is to keep the contract simpler:

  • one interface for all plugins
  • category is expressed via getPluginType()
  • concrete implementation is identified via getPluginKey()

This keeps the system easier to reason about and avoids marker-interface complexity unless category-specific contracts are actually needed later.
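
To show how a single interface can still support per-category handling, here is a minimal sketch of a registry that groups plugins by getPluginType(). AnalyticsPluginRegistry and SimplePlugin are hypothetical names, and the rule that a key must be unique within its plugin type is an assumption derived from the storage discussion, not something decided in this issue.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Contract repeated here so the sketch compiles on its own.
enum AnalyticsPluginType { ACHIEVEMENT, PERSONAL_INFO, PERSONAL_RECORD, SUMMARY, TRAINING }

interface AnalyticsPlugin {
    AnalyticsPluginType getPluginType();
    String getPluginKey();
}

// Hypothetical minimal plugin used only for illustration.
record SimplePlugin(AnalyticsPluginType type, String key) implements AnalyticsPlugin {
    public AnalyticsPluginType getPluginType() { return type; }
    public String getPluginKey() { return key; }
}

// Hypothetical registry: groups plugins by type, no marker interfaces needed.
final class AnalyticsPluginRegistry {
    private final Map<AnalyticsPluginType, List<AnalyticsPlugin>> byType =
            new EnumMap<>(AnalyticsPluginType.class);

    AnalyticsPluginRegistry(List<AnalyticsPlugin> plugins) {
        Set<String> seen = new HashSet<>();
        for (AnalyticsPlugin plugin : plugins) {
            // Assumption: a key only has to be unique within its plugin type.
            String id = plugin.getPluginType() + ":" + plugin.getPluginKey();
            if (!seen.add(id)) {
                throw new IllegalArgumentException("Duplicate plugin: " + id);
            }
            byType.computeIfAbsent(plugin.getPluginType(), t -> new ArrayList<>()).add(plugin);
        }
    }

    // Returns an immutable snapshot of all plugins registered for the given type.
    List<AnalyticsPlugin> pluginsOfType(AnalyticsPluginType type) {
        return List.copyOf(byType.getOrDefault(type, List.of()));
    }
}
```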

Execution model

Analytics recalculation should be handled by a central orchestration service.

FitPub should trigger analytics recalculation when activities are:

  • created
  • uploaded
  • edited
  • deleted

The orchestration service should then:

  • determine which plugins need to run
  • execute plugin recalculation asynchronously
  • run each plugin in its own transaction

Transaction strategy

Each plugin recalculation should run in a separate transaction so that:

  • one failing plugin does not block other plugins
  • partial analytics updates are possible
  • the system remains resilient under plugin-specific failures

Consequence: analytics become eventually consistent, which is acceptable as long as it is a deliberate design decision.
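
The execution model and transaction strategy above could be sketched roughly as follows. Everything here is an assumption for illustration: the recalculate(long userId) method is not part of the agreed contract (it is still an open question), and TransactionRunner is a hypothetical stand-in for whatever transaction mechanism FitPub ends up using. The point is only that each plugin runs asynchronously inside its own transaction boundary and that a failure is recorded instead of propagated.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;

// Hypothetical: the issue leaves the recalculation method open.
interface RecalculablePlugin {
    String getPluginKey();
    void recalculate(long userId) throws Exception;
}

interface ThrowingAction {
    void run() throws Exception;
}

// Hypothetical transaction boundary; a real implementation would begin,
// commit, or roll back a new transaction around the action.
interface TransactionRunner {
    void inNewTransaction(ThrowingAction action) throws Exception;
}

final class AnalyticsOrchestrator {
    private final List<RecalculablePlugin> plugins;
    private final TransactionRunner transactions;
    private final ExecutorService executor;

    AnalyticsOrchestrator(List<RecalculablePlugin> plugins,
                          TransactionRunner transactions,
                          ExecutorService executor) {
        this.plugins = List.copyOf(plugins);
        this.transactions = transactions;
        this.executor = executor;
    }

    // Completes with pluginKey -> success flag once every plugin has finished.
    CompletableFuture<Map<String, Boolean>> recalculate(long userId) {
        Map<String, Boolean> results = new ConcurrentHashMap<>();
        CompletableFuture<?>[] runs = plugins.stream()
                .map(plugin -> CompletableFuture.runAsync(() -> {
                    try {
                        // One transaction per plugin run.
                        transactions.inNewTransaction(() -> plugin.recalculate(userId));
                        results.put(plugin.getPluginKey(), true);
                    } catch (Exception e) {
                        // The failure stays local to this plugin; others continue.
                        results.put(plugin.getPluginKey(), false);
                    }
                }, executor))
                .toArray(CompletableFuture[]::new);
        return CompletableFuture.allOf(runs).thenApply(ignored -> Map.copyOf(results));
    }
}
```

Because results are written per plugin, a caller can see which analytics are up to date and which failed, which matches the eventual-consistency trade-off described above.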

Storage model discussion

Rejected option: one table per plugin

We do not want a separate database table for each individual plugin.
That would create too much schema growth and too much storage boilerplate.

Considered option: one global analytics table

We discussed a fully generic single-table design, but this has risks:

  • payloads become too heterogeneous
  • queries become less clear
  • indexing and validation get harder
  • the model can degrade into an unreadable "catch-all" storage layer

Preferred direction: one table per plugin type

The current preferred direction is:

  • no table per plugin
  • one table per plugin type

This is a good middle ground because plugin types are:

  • defined by FitPub
  • stable
  • expected to change only rarely

This keeps the schema explicit without exploding the number of tables.

Why one table per plugin type

Different plugin types have different semantics and lifecycles:

  • ACHIEVEMENT: earned states or achievement events
  • PERSONAL_RECORD: best values with source/reference activity
  • SUMMARY: current and/or period-based aggregations
  • PERSONAL_INFO: user-specific status/info metrics
  • TRAINING: training-related evaluations, recommendations, or load metrics

A separate table per plugin type allows:

  • clearer data ownership
  • more appropriate constraints and indexes
  • simpler queries
  • easier debugging and maintenance
  • less pressure to over-generalize the storage layer

Open design direction for storage

Within each plugin-type table, entries should still be identified by:

  • user_id
  • plugin_key

Depending on the type, additional fields may be needed, for example:

  • scope identifiers
  • period identifiers
  • related activity_id
  • timestamps such as calculated_at, earned_at, recorded_at

A payload field (for example jsonb) may still be useful, but only as part of a structured per-type model, not as the only storage concept.
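
As one possible shape for such a structured per-type model, the sketch below models a row of a SUMMARY table in Java. The field set (period start/end, calculatedAt, a JSON payload string) and the choice of user, plugin key, and period start as the row identity are assumptions; the issue only fixes user_id and plugin_key as identifying fields.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.util.Objects;

// Hypothetical row model for a SUMMARY table; field names are assumptions.
record SummaryRow(long userId,
                  String pluginKey,
                  LocalDate periodStart,
                  LocalDate periodEnd,
                  Instant calculatedAt,
                  String payloadJson) {

    // Explicit columns are validated; only the payload stays free-form.
    SummaryRow {
        Objects.requireNonNull(pluginKey, "pluginKey");
        Objects.requireNonNull(periodStart, "periodStart");
        Objects.requireNonNull(periodEnd, "periodEnd");
        Objects.requireNonNull(calculatedAt, "calculatedAt");
        if (periodEnd.isBefore(periodStart)) {
            throw new IllegalArgumentException("periodEnd must not be before periodStart");
        }
    }

    // Assumed identity of a row: a recalculation replaces the row with the same key.
    record Key(long userId, String pluginKey, LocalDate periodStart) { }

    Key key() {
        return new Key(userId, pluginKey, periodStart);
    }
}
```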

Architectural principles

The redesign should enforce these boundaries:

  • period/scope determination is separate from storage
  • plugin calculation is separate from orchestration
  • orchestration is separate from persistence
  • plugins should be isolated and easy to add or replace
  • recalculation triggers should be centralized
  • plugin-specific failures should be isolated

Expected benefits

  • simpler analytics implementation
  • lower coupling than the current ActivitySummaryService design
  • easier to extend with new analytics modules
  • clearer failure isolation
  • more maintainable storage model
  • better long-term readability

Open questions

  • What exact methods should AnalyticsPlugin expose beyond getPluginType() and getPluginKey()?
  • How should the orchestration service determine which plugins to run for a given activity change?
  • Which plugin types need historical data vs. current-state data?
  • For each plugin type, which fields should be modeled explicitly in the table and which should remain in payload data?
  • Do we need a unified storage abstraction over the per-type tables?

Next step

Define a minimal target model for:

  1. AnalyticsPlugin
  2. central orchestrator/service
  3. storage tables per pluginType
  4. one concrete example plugin, likely SUMMARY for "current week"

That should be enough to validate the architecture before migrating all analytics features.
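
For the example plugin, the calculation part of a "current week" SUMMARY might look like the sketch below. It keeps period determination separate from aggregation and storage. The Activity and Period types, the distance metric, and the Monday-to-Sunday week definition are all assumptions made for illustration.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.temporal.TemporalAdjusters;
import java.util.List;

// Hypothetical minimal activity model.
record Activity(LocalDate date, double distanceMeters) { }

// Hypothetical closed date range.
record Period(LocalDate start, LocalDate endInclusive) {
    boolean contains(LocalDate date) {
        return !date.isBefore(start) && !date.isAfter(endInclusive);
    }
}

final class CurrentWeek {
    private CurrentWeek() { }

    // Period determination only. Assumes weeks run Monday to Sunday.
    static Period of(LocalDate today) {
        LocalDate monday = today.with(TemporalAdjusters.previousOrSame(DayOfWeek.MONDAY));
        return new Period(monday, monday.plusDays(6));
    }

    // Aggregation only: sums the distance of all activities in the current week.
    static double totalDistance(List<Activity> activities, LocalDate today) {
        Period week = of(today);
        return activities.stream()
                .filter(activity -> week.contains(activity.date()))
                .mapToDouble(Activity::distanceMeters)
                .sum();
    }
}
```

Passing today in as a parameter rather than calling LocalDate.now() inside the plugin keeps the calculation deterministic and easy to test.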

Reference: McPringle/fitpub#30