Audit legacy mediawiki stats used in production dashboards
Closed, Resolved (Public)

Description

The team has suggested a value-driven migration. To facilitate it, this task tracks the metrics that are currently in use, defined here as metrics used in Grafana dashboards.

Our objective is to generate a list of dashboards that need to be converted, which will serve as a guide during the migration. Subsequent tasks will link these metrics to their positions in the queue, and this task will serve as the prioritized list for the conversion.

  • scripted audit of dashboards using graphite datasources, emit metrics used
    • establish initial set of tracking metrics
  • identify mechanism to send metrics from script to prometheus (e.g. pushgateway)
  • create graphite metric status dashboard

Top 10 metrics used in dashboards, from a one-time audit (full list: P54396)

26 MediaWiki.timing.editResponseTime
14 mw.performance.save
 8 MediaWiki.RevisionSlider.timing.init
 7 MediaWiki.Parsoid.html2wt.setup
 7 MediaWiki.Parsoid.html2wt.selser.serialize
 7 MediaWiki.Parsoid.html2wt.selser.domDiff
 7 MediaWiki.Parsoid.html2wt.init
 6 MediaWiki.wikibase.quality.constraints.type.php.success.entities
 6 MediaWiki.Parsoid.html2wt.total
 6 MediaWiki.Parsoid.html2wt.timePerInputKB
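
A tally like the one above can be produced by counting how often each metric is referenced across dashboards. The following is a minimal sketch, not the actual audit script: the function names `top_metrics` and `format_report` and the sample input are hypothetical, and it assumes the audit has already produced (dashboard, metric) pairs. The task does not state whether the original counts are per dashboard, per panel, or per query; this sketch simply counts each reference once.

```python
from collections import Counter


def top_metrics(references, n=10):
    """Return the n most-referenced metrics as (metric, count) pairs.

    `references` is an iterable of (dashboard, metric) pairs (hypothetical
    input shape, assumed for this sketch).
    """
    counts = Counter(metric for _dashboard, metric in references)
    # Sort by descending count, then by name so ties come out in a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:n]


def format_report(ranked):
    """Format (metric, count) pairs as right-aligned 'count metric' lines."""
    width = max((len(str(count)) for _metric, count in ranked), default=1)
    return "\n".join(f"{count:>{width}} {metric}" for metric, count in ranked)


# Made-up sample data, for illustration only.
sample = [
    ("dash-a", "MediaWiki.timing.editResponseTime"),
    ("dash-b", "MediaWiki.timing.editResponseTime"),
    ("dash-b", "mw.performance.save"),
]
print(format_report(top_metrics(sample)))
```

Sorting ties by name keeps the report reproducible between runs, which makes it easier to diff successive audits.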

Details

Title: initial debian packaging
Reference: repos/sre/python-verlib2!1
Author: herron
Source Branch: packaging-wikimedia
Dest Branch: main

Event Timeline

herron renamed this task from Audit & convert stats in use in production to statslib to Audit legacy mediawiki stats used in production. (Nov 9 2023, 2:46 PM)
herron renamed this task from Audit legacy mediawiki stats used in production to Audit legacy mediawiki stats used in production dashboards.
herron triaged this task as Medium priority.

I spent some time today experimenting with https://github.com/grafana/cortex-tools, specifically cortextool analyse grafana, which looked promising. Unfortunately it throws parse errors when it encounters a period in a metric name, which makes it unsuitable for graphite metrics.

So instead I've been working on a simple script that walks the dashboard API looking for dashboards with a graphite datasource and outputs the metrics used. However, instead of producing a one-time/manual report here, I'm thinking we should build some ongoing status reporting.

I'm thinking the next step here is to expand the script to send a few metrics that capture the ongoing state of graphite utilization to something like the Prometheus Pushgateway, and to build a status dashboard using these metrics. With T350825 we could possibly annotate panels with relevant commits as well. I'll expand the task description to include high-level steps for that.
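
The core of such a script can be sketched as a function that takes one dashboard's JSON and returns the graphite queries it contains. This is only an illustration under stated assumptions, not the script described here: the function names are hypothetical, the fetch step is omitted (the input is assumed to be the `dashboard` object that Grafana's HTTP API returns, for example from `GET /api/dashboards/uid/<uid>`), and matching a string datasource by the substring "graphite" is an assumption about how datasources are named.

```python
def uses_graphite(datasource):
    """Return True if a datasource reference appears to point at Graphite.

    Dashboards may store this field as a plain name (older dashboards) or as
    an object with a "type" key (newer ones); both forms are handled.
    """
    if isinstance(datasource, dict):
        return datasource.get("type") == "graphite"
    if isinstance(datasource, str):
        # Assumption: graphite datasources have "graphite" in their name.
        return "graphite" in datasource.lower()
    return False


def iter_panels(dashboard):
    """Yield every panel in a dashboard, including panels nested in rows."""
    stack = list(dashboard.get("panels", []))
    for row in dashboard.get("rows", []):  # legacy row-based layout
        stack.extend(row.get("panels", []))
    while stack:
        panel = stack.pop()
        yield panel
        stack.extend(panel.get("panels", []))  # panels inside a row panel


def graphite_targets(dashboard):
    """Return the sorted graphite query strings used by a dashboard."""
    found = []
    for panel in iter_panels(dashboard):
        panel_ds = panel.get("datasource")
        for target in panel.get("targets", []):
            # A target may override the datasource set on its panel.
            ds = target.get("datasource", panel_ds)
            query = target.get("target")
            if query and uses_graphite(ds):
                found.append(query)
    return sorted(found)


# Made-up dashboard, for illustration only.
example = {
    "panels": [
        {"datasource": {"type": "graphite", "uid": "abc"},
         "targets": [{"target": "MediaWiki.timing.editResponseTime.p75"}]},
        {"datasource": {"type": "prometheus", "uid": "def"},
         "targets": [{"expr": "up"}]},
        {"type": "row",
         "panels": [{"datasource": "graphite",
                     "targets": [{"target": "mw.performance.save.median"}]}]},
    ]
}
print(graphite_targets(example))
```

Because the function works on plain dictionaries, it can be exercised against saved dashboard JSON without a running Grafana instance. Note that it returns whole query strings, which may wrap metric paths in graphite functions; extracting bare metric names would need a further step.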

Very draft metric list (to be expanded/refined/clarified)

  • Dashboards using graphite datasource
  • Annotations using graphite datasource
  • Panels using graphite datasource
  • Graphite metric count
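
As a rough illustration of how counts like these could be handed to Prometheus, the sketch below renders them in the Prometheus text exposition format. A body in this format is what the Pushgateway accepts over HTTP at `/metrics/job/<job_name>`; the actual push is left out here. The metric names, help strings, and values are all hypothetical and are not the names used by the exporter that was eventually deployed.

```python
def render_exposition(samples):
    """Render {name: (help_text, value)} as Prometheus text-format gauges."""
    lines = []
    for name in sorted(samples):
        help_text, value = samples[name]
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    # The text format expects the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical metric names and made-up values, for illustration only.
draft_metrics = {
    "grafana_graphite_dashboards": ("Dashboards using a graphite datasource", 3),
    "grafana_graphite_annotations": ("Annotations using a graphite datasource", 1),
    "grafana_graphite_panels": ("Panels using a graphite datasource", 12),
    "grafana_graphite_metric_count": ("Distinct graphite metrics referenced", 40),
}
print(render_exposition(draft_metrics), end="")
```

Gauges are used because each value is a point-in-time count that can go down as dashboards are converted.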

Change 980048 had a related patch set uploaded (by Herron; author: Herron):

[operations/puppet@production] grafana: add dashboard graphite usage exporter

https://gerrit.wikimedia.org/r/980048

herron updated the task description.

Change 980048 merged by Herron:

[operations/puppet@production] grafana: add dashboard datasource usage (graphite) exporter

https://gerrit.wikimedia.org/r/980048

herron closed this task as Resolved. (Jan 17 2024, 8:47 PM)

A custom Grafana graphite datasource exporter has been deployed, along with a Grafana dashboard that uses its metrics to outline current graphite datasource utilization.

This will let us track ongoing utilization in terms of how many dashboards and panels are still actively using the legacy graphite datasource (metrics are updated hourly).

Dashboard is located at https://grafana.wikimedia.org/d/K6DEOo5Ik/grafana-graphite-datasource-utilization

With that I think we're done here!