taavi (Taavi Väänänen)
SREAdministrator

User Details

User Since
Feb 24 2019, 3:58 PM (325 w, 3 d)
Roles
Administrator
Availability
Available
IRC Nick
taavi
LDAP User
Majavah
MediaWiki User
Taavi

Recent Activity

Yesterday

taavi added a comment to T393403: Create a security pre-release Phabricator policy manageable by the Security Team.

How does this relate to the existing acl*release_security_pre_announce?

Wed, May 21, 8:33 PM · MediaWiki-extensions-General, MediaWiki-General, Project-Admins, Release-Engineering-Team, Security, Security-Team
taavi closed T394925: Phakhaphon Phummiphat as Invalid.
Wed, May 21, 2:33 PM · Trash
taavi removed a subtask for T60937: Add "under development" stage before "proposed" stage for OAuth consumers: T394925: Phakhaphon Phummiphat .
Wed, May 21, 2:32 PM · MediaWiki-Platform-Team (Roadmap), Patch-Needs-Improvement, MediaWiki-extensions-OAuth
taavi removed a parent task for T394925: Phakhaphon Phummiphat : T60937: Add "under development" stage before "proposed" stage for OAuth consumers.
Wed, May 21, 2:32 PM · Trash
taavi removed a parent task for T394919: Add accessable mainpage for non-logged-in readers on zh.arbcom: T394925: Phakhaphon Phummiphat .
Wed, May 21, 2:32 PM · Patch-For-Review, Chinese-Sites, Wikimedia-Site-requests
taavi removed a parent task for T394914: Update emptyUserGroup.php to optionally support creating log entries for removal: T394925: Phakhaphon Phummiphat .
Wed, May 21, 2:32 PM · MW-1.45-notes (1.45.0-wmf.2; 2025-05-20), Trust and Safety Product Sprint (Sprint Key Lime Pie (May 5 - May 23)), Temporary accounts (Major pilot wiki deployment)
taavi removed a parent task for T394913: Consider passing only type conversion information to the evaluator (no types): T394925: Phakhaphon Phummiphat .
Wed, May 21, 2:32 PM · Epic, function-schemata, function-evaluator, function-orchestrator, Abstract Wikipedia team
taavi removed a parent task for T394916: 403 Forbidden on Gerrit: T394925: Phakhaphon Phummiphat .
Wed, May 21, 2:32 PM · Gerrit
taavi removed a parent task for T394920: Enable local file upload on zh.arbcom: T394925: Phakhaphon Phummiphat .
Wed, May 21, 2:32 PM · Patch-For-Review, Chinese-Sites, Wikimedia-Site-requests
taavi removed a parent task for T394917: Capacity planning for Wikifunctions services: T394925: Phakhaphon Phummiphat .
Wed, May 21, 2:32 PM · Epic, Essential-Work, function-schemata, function-evaluator, function-orchestrator, Abstract Wikipedia team
taavi removed a parent task for T394921: PuppetConstantChange on clouddumps100[12]: T394925: Phakhaphon Phummiphat .
Wed, May 21, 2:32 PM · Patch-For-Review, Data-Platform-SRE, cloud-services-team (FY2024/2025-Q3-Q4)
taavi removed a parent task for T394918: Should afterTest failures make test fail?: T394925: Phakhaphon Phummiphat .
Wed, May 21, 2:32 PM · Testing Support
taavi removed subtasks for T394925: Phakhaphon Phummiphat : T394913: Consider passing only type conversion information to the evaluator (no types), T394914: Update emptyUserGroup.php to optionally support creating log entries for removal, T394916: 403 Forbidden on Gerrit, T394917: Capacity planning for Wikifunctions services, T394918: Should afterTest failures make test fail?, T394919: Add accessable mainpage for non-logged-in readers on zh.arbcom, T394920: Enable local file upload on zh.arbcom, T394921: PuppetConstantChange on clouddumps100[12], T394923: [Hiring] review application 428854492.
Wed, May 21, 2:32 PM · Trash
taavi removed a parent task for T394923: [Hiring] review application 428854492: T394925: Phakhaphon Phummiphat .
Wed, May 21, 2:32 PM · Essential-Work, Editing-team
taavi closed T366935: Cloud VPS mail servers should drop mail sent from non-supported domains as Resolved.

Done, and documented at https://wikitech.wikimedia.org/wiki/Help:Email_in_Cloud_VPS.

Wed, May 21, 1:57 PM · cloud-services-team, Cloud-VPS
taavi added a comment to T394803: Unbreak deployment-prep config-master.

No, not really. Some of that data comes from etcd, but there are also files like https://config-master.wikimedia.org/known_hosts that are generated from PuppetDB and quite useful to have on deployment-prep as well. I believe this was previously applied to the puppetmaster VM, but that was lost in the Puppet 5 -> 7 migration. On the wikiprod side this service was split to a tiny separate VM and that's a route deployment-prep could use as well.

Wed, May 21, 1:40 PM · Beta-Cluster-Infrastructure
taavi closed T394823: toolsbeta sudo rules for new tools being created on wrong project as Resolved.

This is fixed, and I moved all the existing rules to the correct OU.

Wed, May 21, 1:05 PM · cloud-services-team, Striker
taavi closed T394775: wmcs-enc-cli: keystoneauth1.exceptions.http.Forbidden: You are not authorized to perform the requested action: identity:list_services. as Resolved.
Wed, May 21, 12:37 PM · cloud-services-team, Cloud-VPS
taavi claimed T211575: Enable IPv6 on toolforge.org.
Wed, May 21, 12:10 PM · Patch-For-Review, cloud-services-team, Toolforge, IPv6
taavi closed T375523: [toolforge-prometheus] upgrade to bookworm, a subtask of T387005: Toolforge: migrate to Debian Bookworm or later, as Resolved.
Wed, May 21, 9:56 AM · Toolforge, cloud-services-team
taavi closed T375523: [toolforge-prometheus] upgrade to bookworm, a subtask of T393697: Rebuild Toolforge Prometheus nodes in v6-dualstack network, as Resolved.
Wed, May 21, 9:56 AM · Toolforge (Toolforge iteration 20), IPv6, cloud-services-team
taavi closed T375523: [toolforge-prometheus] upgrade to bookworm as Resolved.
Wed, May 21, 9:56 AM · cloud-services-team, Toolforge
taavi closed T393697: Rebuild Toolforge Prometheus nodes in v6-dualstack network, a subtask of T392509: Enable IPv6 for Toolforge services, as Resolved.
Wed, May 21, 9:55 AM · IPv6, Toolforge, cloud-services-team
taavi closed T393697: Rebuild Toolforge Prometheus nodes in v6-dualstack network as Resolved.
Wed, May 21, 9:55 AM · Toolforge (Toolforge iteration 20), IPv6, cloud-services-team
taavi renamed T394823: toolsbeta sudo rules for new tools being created on wrong project from toolsbeta sudo rules for new projects being created on wrong project to toolsbeta sudo rules for new tools being created on wrong project.
Wed, May 21, 8:07 AM · cloud-services-team, Striker
taavi added a comment to T394859: Quarry WMCloud (ruwiki_p, section s6) experiencing sustained replication lag (~16 h).

This is due to a hardware issue with one of the hosts involved in the replication chain to the wiki replicas: T394624: db1155 HW memory errors

Wed, May 21, 7:55 AM · Quarry, Data-Services, cloud-services-team
taavi added a comment to T394883: cloudservices2005-dev backups are clogging all backups.

Yes please.

Wed, May 21, 7:54 AM · Cloud-VPS, cloud-services-team, Data-Persistence-Backup, bacula
taavi claimed T394883: cloudservices2005-dev backups are clogging all backups.

Thanks for the poke. This host seems to have accidentally gotten rebooted back to the broken kernel from T393366: Regression in RAID10 software RAID with 6.1.135 and was in a locked up state. I've rebooted it and upgraded it to a working kernel which should resolve those backup issues.

Wed, May 21, 7:53 AM · Cloud-VPS, cloud-services-team, Data-Persistence-Backup, bacula

Tue, May 20

taavi edited projects for T394823: toolsbeta sudo rules for new tools being created on wrong project, added: cloud-services-team; removed Toolsbeta-Tools.
Tue, May 20, 6:28 PM · cloud-services-team, Striker
taavi created T394803: Unbreak deployment-prep config-master.
Tue, May 20, 4:17 PM · Beta-Cluster-Infrastructure
taavi edited projects for T394790: Failures when draining certain VMs with attached cinder volumes (coibot-2), added: Cloud-VPS; removed decommission-hardware.
Tue, May 20, 3:29 PM · Cloud-VPS, cloud-services-team
taavi added a project to T394775: wmcs-enc-cli: keystoneauth1.exceptions.http.Forbidden: You are not authorized to perform the requested action: identity:list_services.: Cloud-VPS.
Tue, May 20, 2:03 PM · cloud-services-team, Cloud-VPS
taavi created T394775: wmcs-enc-cli: keystoneauth1.exceptions.http.Forbidden: You are not authorized to perform the requested action: identity:list_services..
Tue, May 20, 1:09 PM · cloud-services-team, Cloud-VPS
taavi edited projects for T394617: Create existencelinks table in production, added: DBA; removed cloud-services-team, Data-Services, Data-Engineering.
Tue, May 20, 12:17 PM · DBA
taavi updated the task description for T390954: Set up x3 replication to wikireplicas.
Tue, May 20, 11:20 AM · Patch-For-Review, User-notice, cloud-services-team, Data-Services, DBA, Wikidata
taavi renamed T394754: Error while enabling event registration: Wikimedia\Rdbms\DBQueryError: Error 1054: Unknown column 'event_type' in 'INSERT INTO' Function: MediaWiki\Extension\CampaignEvents\Event\Store\EventStore::saveRegistration Query: INSERT INTO `campaign_events` (event_name,event from Error while enabling event registration to Error while enabling event registration: Wikimedia\Rdbms\DBQueryError: Error 1054: Unknown column 'event_type' in 'INSERT INTO' Function: MediaWiki\Extension\CampaignEvents\Event\Store\EventStore::saveRegistration Query: INSERT INTO `campaign_events` (event_name,event.
Tue, May 20, 10:20 AM · CampaignEvents, Wikimedia-production-error, Campaign-Registration
taavi changed the subtype of T394754: Error while enabling event registration: Wikimedia\Rdbms\DBQueryError: Error 1054: Unknown column 'event_type' in 'INSERT INTO' Function: MediaWiki\Extension\CampaignEvents\Event\Store\EventStore::saveRegistration Query: INSERT INTO `campaign_events` (event_name,event from "Bug Report" to "Production Error".
Tue, May 20, 10:19 AM · CampaignEvents, Wikimedia-production-error, Campaign-Registration

Mon, May 19

taavi closed T394691: puppet-enc issue with Hiera values starting with a colon due to PyYAML and Ruby YAML parsing differences as Resolved.
Mon, May 19, 3:50 PM · cloud-services-team, Cloud-VPS
taavi claimed T394691: puppet-enc issue with Hiera values starting with a colon due to PyYAML and Ruby YAML parsing differences.
Mon, May 19, 3:40 PM · cloud-services-team, Cloud-VPS
taavi created T394691: puppet-enc issue with Hiera values starting with a colon due to PyYAML and Ruby YAML parsing differences.
Mon, May 19, 3:39 PM · cloud-services-team, Cloud-VPS
taavi added a comment to T382171: Install ORES extension on idwiki.

checking on quarry.wmcloud.org I see that the table isn't there

Mon, May 19, 3:23 PM · Wikimedia-Extension-setup, MediaWiki-extensions-ORES, Wikimedia-Site-requests, Machine-Learning-Team, Patch-For-Review, Moderator-Tools-Team
taavi added a comment to T376400: Redesign wikitech-static.

Can you point me to some specific examples? My half-baked spot checks (e.g. http://ec2-54-81-201-239.compute-1.amazonaws.com/wiki/Eqiad_data_center.html#/media/File:Eqiad_logical.png) seem to be hosted locally (or I am misunderstanding something fundamental).

Mon, May 19, 1:57 PM · Patch-For-Review, serviceops-radar, SRE-Unowned, SRE, wikitech.wikimedia.org
taavi updated the task description for T332478: Decouple Toolforge API gateway authentication from Kubernetes certificates.
Mon, May 19, 1:17 PM · cloud-services-team, Toolforge
taavi added a comment to T332478: Decouple Toolforge API gateway authentication from Kubernetes certificates.

@taavi should I close this as duplicate? Or do you want to refresh/extend the oauth+dedicated auth server specific proposal?

Mon, May 19, 11:05 AM · cloud-services-team, Toolforge
taavi added a comment to T394035: Decision request - Tool account management and Striker.

Option Purple is my favourite so far, but I'm still a bit confused about what the new service would look like. Apologies if I'm fixating on this; we can also split this topic into a separate Decision Request or Design Document.

Mon, May 19, 10:57 AM · cloud-services-team, Striker, Cloud Services Proposals
taavi added a comment to T394629: EmailAuth uses account language for mail content, but content language for times.

Huh. In that case probably worth noting that I'm actually not sure if that particular account is set to English, I just saw a message with mixed languages and figured that is a bug.

Mon, May 19, 8:58 AM · Patch-For-Review, MediaWiki-extensions-EmailAuth
taavi created T394629: EmailAuth uses account language for mail content, but content language for times.
Mon, May 19, 8:32 AM · Patch-For-Review, MediaWiki-extensions-EmailAuth

Sun, May 18

taavi created T394608: CampaignEvents should degrade gracefully when querying events for a page fails.
Sun, May 18, 12:19 PM · CampaignEvents

Fri, May 16

taavi awarded T394533: Slack interaction for Wikibugs a Heartbreak token.
Fri, May 16, 5:26 PM · Wikibugs
taavi added a comment to T394523: Grant Access to https://idm.wikimedia.org/ for maxbinderWMF.

Hmm, are you intentionally using a new developer account for this instead of the existing 'mbinder' one that your Phabricator shell access is already associated with?

Fri, May 16, 3:37 PM · SRE, LDAP-Access-Requests
taavi reopened T394453: Emails to [email protected] from [email protected] bouncing as "Open".

Re-opening, because that doesn't really solve the problem. We do not want to send Puppet failure emails to these service accounts in the first place.

Fri, May 16, 10:50 AM · cloud-services-team
taavi added a hashtag to DC-Ops: #dcops.
Fri, May 16, 10:28 AM

Thu, May 15

Lucas_Werkmeister_WMDE awarded T378740: scap: announce testserver sync complete before running checks a Doubloon token.
Thu, May 15, 3:56 PM · Scap
taavi changed the status of T394022: install Newsletter extension in English Wikinews from Open to Stalled.

This extension is lacking a formal steward on https://www.mediawiki.org/wiki/Developers/Maintainers. I do not believe it is a good idea to deploy it more widely unless that is addressed.

Thu, May 15, 1:54 PM · Wikimedia-Extension-setup, Wikimedia-Site-requests
taavi closed T144943: Groups and tools only refreshed at login as Resolved.
Thu, May 15, 12:43 PM · Striker
taavi closed T394278: django.core.cache.backends.memcached.MemcachedCache is removed in Django 4.1, a subtask of T359217: Update Django version used in Striker, as Resolved.
Thu, May 15, 12:43 PM · Striker
taavi closed T394278: django.core.cache.backends.memcached.MemcachedCache is removed in Django 4.1 as Resolved.
Thu, May 15, 12:43 PM · Striker
taavi added a comment to T290147: Enable interwiki links to/from Wikitech.

@taavi As the Wikidata Integrations in Wikimedia Projects team, we are happy to review any new changes on the topic (although I can't +2 config changes). I can see you have an open mediawiki-config change, but it's difficult to review without knowing what the next steps will be. I can see you've written some changes are pending answers from T171143, what questions are those, and what impact may the answers make?

Thu, May 15, 9:58 AM · Wikidata Integration in Wikimedia projects (Kanban Board), MW-1.44-notes (1.44.0-wmf.19; 2025-03-04), Patch-For-Review, Wikimedia-Interwiki-links, wikitech.wikimedia.org, Wikidata
taavi added a comment to T376400: Redesign wikitech-static.

The site at http://ec2-54-81-201-239.compute-1.amazonaws.com/ seems to embed images from upload.wikimedia.org. For pages like network diagrams, those should also be hosted off-cluster.

Thu, May 15, 9:57 AM · Patch-For-Review, serviceops-radar, SRE-Unowned, SRE, wikitech.wikimedia.org
taavi closed T392792: project-proxy puppetserver CA about to expire as Resolved.
Thu, May 15, 9:25 AM · cloud-services-team, Cloud-VPS
taavi added a project to P76199 (An Untitled Masterwork): Puppet.
Thu, May 15, 8:16 AM · Puppet, Cloud-VPS
taavi added a project to P76199 (An Untitled Masterwork): Cloud-VPS.
Thu, May 15, 8:16 AM · Puppet, Cloud-VPS
taavi changed the edit policy for P76199 (An Untitled Masterwork).
Thu, May 15, 8:16 AM · Puppet, Cloud-VPS
taavi created P76199 (An Untitled Masterwork).
Thu, May 15, 8:15 AM · Puppet, Cloud-VPS
taavi claimed T392792: project-proxy puppetserver CA about to expire.
Thu, May 15, 7:44 AM · cloud-services-team, Cloud-VPS

Wed, May 14

taavi moved T394111: Update release-branch CREDITS prior to next MediaWiki maintenance/security releases from Blocker to Not a blocker on the MW-1.43-release board.
Wed, May 14, 7:52 PM · MediaWiki-Engineering, MW-1.43-release, MW-1.42-release, MW-1.39-release, MediaWiki-Releasing
taavi moved T394111: Update release-branch CREDITS prior to next MediaWiki maintenance/security releases from Blocker to Not a blocker on the MW-1.42-release board.
Wed, May 14, 7:52 PM · MediaWiki-Engineering, MW-1.43-release, MW-1.42-release, MW-1.39-release, MediaWiki-Releasing
taavi moved T394111: Update release-branch CREDITS prior to next MediaWiki maintenance/security releases from Blocker to Not a blocker on the MW-1.39-release board.
Wed, May 14, 7:52 PM · MediaWiki-Engineering, MW-1.43-release, MW-1.42-release, MW-1.39-release, MediaWiki-Releasing
taavi renamed T393140: Update SSH key for apine from Requesting access to eqiad, codfw, bast for apine to Update SSH key for apine.
Wed, May 14, 6:37 PM · SRE, SRE-Access-Requests
bd808 awarded T359972: Dev environment Keystone crashes a Blobhaj token.
Wed, May 14, 4:10 PM · Striker
taavi closed T393836: Creating accounts on votewiki results in error, does not send email, but is created anyway as Resolved.

Not really, since the train moving forward has made backporting to a wmf branch more or less moot at this point.

Wed, May 14, 3:42 PM · MW-1.45-notes (1.45.0-wmf.1; 2025-05-13), MW-1.44-release, Wikimedia-production-error, MediaWiki-extensions-BounceHandler
taavi added a comment to T394035: Decision request - Tool account management and Striker.

it's better to make a Toolforge-specific service than to make something that's in theory generic but in practice only used by us

Agreed! But we should consider carefully the requirements for this service and its interface. I think I would like this service to be as "thin" as possible, does it need to contain Toolforge-specific logic? Could it be a generic LDAP adapter, with some minimal logic to restrict the damage you can do through its API?

Wed, May 14, 2:58 PM · cloud-services-team, Striker, Cloud Services Proposals
taavi added a comment to T394035: Decision request - Tool account management and Striker.

Could it be a generic LDAP adapter, with some minimal logic to restrict the damage you can do through its API?

This would increase the security issues with that service, as it would expand the possible actions (just saying, that might make it harder to harden).

Yes, that's fair. What are the write actions that the API needs to perform against LDAP? From a quick search in the Striker codebase, I can only see adding and deleting SSH keys. I imagine it also needs to modify LDAP group membership, but I can't find that in the Striker codebase.

Wed, May 14, 2:52 PM · cloud-services-team, Striker, Cloud Services Proposals
taavi added a member for acl*Batch-Editors: dcaro.
Wed, May 14, 2:27 PM
taavi triaged T394035: Decision request - Tool account management and Striker as Medium priority.
Wed, May 14, 1:29 PM · cloud-services-team, Striker, Cloud Services Proposals
taavi created T394304: Make Puppet able to reliabily restart sssd.
Wed, May 14, 1:18 PM · Cloud-VPS, cloud-services-team
taavi created P76147 (An Untitled Masterwork).
Wed, May 14, 1:13 PM
taavi added a project to T394286: Avoid using codfw expansion cage for non-IPIP LVS-fronted services: SRE.
Wed, May 14, 11:16 AM · SRE
taavi renamed T394280: [components-api] Add admin documentation page from [components-api] Add admin page to [components-api] Add admin documentation page.
Wed, May 14, 9:03 AM · Toolforge (Toolforge iteration 20), Documentation
taavi added a project to T394280: [components-api] Add admin documentation page: Documentation.
Wed, May 14, 9:03 AM · Toolforge (Toolforge iteration 20), Documentation
taavi created T394278: django.core.cache.backends.memcached.MemcachedCache is removed in Django 4.1.
Wed, May 14, 8:58 AM · Striker
taavi claimed T359217: Update Django version used in Striker.
Wed, May 14, 8:38 AM · Striker
taavi added a project to T394070: Wikimedia\Services\NoSuchServiceException: No such service: GrowthExperimentsUserImpactLookup when loading Special:RestSandbox: MW-1.44-release.
Wed, May 14, 8:22 AM · MW-1.44-release, MW-1.45-notes (1.45.0-wmf.2; 2025-05-20), CheckUser-UserInfoCard, Growth-Team, GrowthExperiments, Trust and Safety Product Sprint (Sprint Key Lime Pie (May 5 - May 23)), Trust and Safety Product Team, MediaWiki-REST-API, Wikimedia-production-error
taavi closed T359559: django-ratelimit-backend is not compatible with Django 3.x as Resolved.

This is worked around for now by pinning a commit hash from https://github.com/supertassu/django-ratelimit-backend until T359554: Use IDP for authentication in Striker happens.

Wed, May 14, 7:40 AM · Striker
taavi closed T359559: django-ratelimit-backend is not compatible with Django 3.x, a subtask of T359217: Update Django version used in Striker, as Resolved.
Wed, May 14, 7:40 AM · Striker

Tue, May 13

taavi renamed T394035: Decision request - Tool account management and Striker from [DRAFT] Decision request - Tool account management and Striker to Decision request - Tool account management and Striker.
Tue, May 13, 3:52 PM · cloud-services-team, Striker, Cloud Services Proposals
taavi added a comment to T394035: Decision request - Tool account management and Striker.

In option Purple, would it be possible to deploy the new "LDAP API" service to wikiprod hardware

Yes.

Tue, May 13, 3:52 PM · cloud-services-team, Striker, Cloud Services Proposals
taavi closed T393975: Route deployment-prep Prometheus alerts to the [email protected] mailing list as Resolved.
Tue, May 13, 3:39 PM · cloud-services-team, Cloud-VPS, Beta-Cluster-Infrastructure
taavi closed T393975: Route deployment-prep Prometheus alerts to the [email protected] mailing list, a subtask of T393926: Setup mailing list for automated monitoring reports from Beta Cluster project, as Resolved.
Tue, May 13, 3:39 PM · User-bd808, Beta-Cluster-Infrastructure
taavi added a comment to T394035: Decision request - Tool account management and Striker.

I can definitely see the benefit of the HTTP API, way easier to use than the LDAP interface.

But also, I'm trying to think, how is exposing the REST API different than exposing the LDAP URI for RW directly? It may be the same level of security/auth/authn complications, etc. Or maybe better, because there is usually better tooling, middleware and such, for HTTP.

Tue, May 13, 3:31 PM · cloud-services-team, Striker, Cloud Services Proposals
Chlod awarded T393333: +2 on mediawiki/* for Chlod a Party Time token.
Tue, May 13, 2:45 PM · MediaWiki-Gerrit-Group-Requests
taavi closed T393333: +2 on mediawiki/* for Chlod as Resolved.

I'm not seeing any opposition, so went ahead and implemented this. And wikitech-l post for the records.

Tue, May 13, 2:44 PM · MediaWiki-Gerrit-Group-Requests
taavi updated the task description for T394035: Decision request - Tool account management and Striker.
Tue, May 13, 1:56 PM · cloud-services-team, Striker, Cloud Services Proposals
taavi updated the task description for T394035: Decision request - Tool account management and Striker.
Tue, May 13, 1:50 PM · cloud-services-team, Striker, Cloud Services Proposals
taavi updated the task description for T394035: Decision request - Tool account management and Striker.
Tue, May 13, 1:48 PM · cloud-services-team, Striker, Cloud Services Proposals
taavi created T394035: Decision request - Tool account management and Striker.
Tue, May 13, 1:29 PM · cloud-services-team, Striker, Cloud Services Proposals
taavi closed T394023: Recover missing .kube/config for Toolforge tool `inat2wiki` as Resolved.
taavi@tools-bastion-12:~ $ kubectl sudo delete cm -n tool-inat2wiki maintain-kubeusers-inat2wiki
configmap "maintain-kubeusers-inat2wiki" deleted

The system will now provision new files within a few minutes.

Tue, May 13, 12:32 PM · cloud-services-team, Toolforge
taavi closed T359972: Dev environment Keystone crashes as Resolved.
Tue, May 13, 12:27 PM · Striker
taavi added a comment to T309268: Puppet should prune stale entries from sudoers.d.

===== NODE GROUP =====
(2) deploy2002.codfw.wmnet,deploy1003.eqiad.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/parsoid-admin
===== NODE GROUP =====
(2) webperf2003.codfw.wmnet,webperf1003.eqiad.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/scap_deploy-service_coal
/etc/sudoers.d/scap_sudo_rules_deploy-service_performance_coal
===== NODE GROUP =====
(2) maps2009.codfw.wmnet,maps1009.eqiad.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/nrpe-check_redis_status_on_port_6379
/etc/sudoers.d/scap_deploy-service
/etc/sudoers.d/scap_deploy-service_cassandra-metrics-collector
/etc/sudoers.d/scap_deploy-service_kartotherian
/etc/sudoers.d/scap_deploy-service_tilerator
/etc/sudoers.d/scap_deploy-service_tileratorui
/etc/sudoers.d/tilerator-admin
/etc/sudoers.d/tilerator-notification
===== NODE GROUP =====
(10) maps[2005-2008,2010].codfw.wmnet,maps[1005-1008,1010].eqiad.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/scap_deploy-service
/etc/sudoers.d/scap_deploy-service_cassandra-metrics-collector
/etc/sudoers.d/scap_deploy-service_kartotherian
/etc/sudoers.d/scap_deploy-service_tilerator
/etc/sudoers.d/scap_deploy-service_tileratorui
/etc/sudoers.d/tilerator-admin
===== NODE GROUP =====
(2) puppetmaster2002.codfw.wmnet,puppetmaster1003.eqiad.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/labs_private_needs_merge
===== NODE GROUP =====
(2) cloudcontrol2006-dev.codfw.wmnet,cloudcontrol1011.eqiad.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/cinder-common
/etc/sudoers.d/designate_sudoers
/etc/sudoers.d/glance_sudoers
/etc/sudoers.d/neutron_sudoers
/etc/sudoers.d/nova-common
===== NODE GROUP =====
(1) cloudcontrol2004-dev.codfw.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/cinder-common
/etc/sudoers.d/designate_sudoers
/etc/sudoers.d/glance_sudoers
/etc/sudoers.d/neutron_sudoers
/etc/sudoers.d/nova-common
/etc/sudoers.d/nrpe-check_check-flavor_aggregates
===== NODE GROUP =====
(1) cloudcontrol1006.eqiad.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/cinder-common
/etc/sudoers.d/designate_sudoers
/etc/sudoers.d/glance_sudoers
/etc/sudoers.d/neutron_sudoers
/etc/sudoers.d/nova-common
/etc/sudoers.d/nrpe-check_check-flavor_aggregates
/etc/sudoers.d/nrpe-check_raid_perc_raid
/etc/sudoers.d/nrpe-get_raid_status_perccli
===== NODE GROUP =====
(2) cloudcontrol2005-dev.codfw.wmnet,cloudcontrol1007.eqiad.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/cinder-common
/etc/sudoers.d/designate_sudoers
/etc/sudoers.d/glance_sudoers
/etc/sudoers.d/neutron_sudoers
/etc/sudoers.d/nova-common
/etc/sudoers.d/nrpe-check_raid_perc_raid
/etc/sudoers.d/nrpe-get_raid_status_perccli
===== NODE GROUP =====
(11) wcqs[2001-2003].codfw.wmnet,wcqs[1001-1003].eqiad.wmnet,wdqs[2024-2027].codfw.wmnet,wdqs1023.eqiad.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/blazegraph-reload-nginx
===== NODE GROUP =====
(2) wdqs2023.codfw.wmnet,wdqs1024.eqiad.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/blazegraph-reload-nginx
/etc/sudoers.d/wdqs-test-roots
===== NODE GROUP =====
(3) cloudweb2002-dev.wikimedia.org,cloudweb[1003-1004].wikimedia.org
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/deployment
/etc/sudoers.d/mwdeploy
===== NODE GROUP =====
(1) cloudcephmon2004-dev.codfw.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/ceph-osd-smartctl
/etc/sudoers.d/ceph-smartctl.dpkg-dist
/etc/sudoers.d/labtest-roots
===== NODE GROUP =====
(37) cloudcephosd[1004-1024,1026-1041].eqiad.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/ceph-osd-smartctl
/etc/sudoers.d/ceph-smartctl
===== NODE GROUP =====
(3) cloudcephosd[2001-2003]-dev.codfw.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/ceph-osd-smartctl
/etc/sudoers.d/ceph-smartctl
/etc/sudoers.d/labtest-roots
===== NODE GROUP =====
(1) cloudcephosd2004-dev.codfw.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/ceph-osd-smartctl
/etc/sudoers.d/nrpe-check_raid_perc_raid
/etc/sudoers.d/nrpe-get_raid_status_perccli
===== NODE GROUP =====
(3) cloudcephmon[1004-1006].eqiad.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/ceph-smartctl.dpkg-dist
===== NODE GROUP =====
(2) cloudcephmon[2005-2006]-dev.codfw.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/ceph-smartctl.dpkg-dist
/etc/sudoers.d/labtest-roots
===== NODE GROUP =====
(1) netbox-dev2003.codfw.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/scap_netbox
/etc/sudoers.d/scap_netbox_netbox
/etc/sudoers.d/scap_sudo_rules_netbox_netbox-dev_deploy
===== NODE GROUP =====
(23) aqs[2001-2012].codfw.wmnet,aqs[1010-1012,1014-1021].eqiad.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/scap_deploy-service_aqs
===== NODE GROUP =====
(1) cumin2002.codfw.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/nrpe-check_puppet_run_changes
/etc/sudoers.d/scap_deploy-homer
===== NODE GROUP =====
(1) crm2001.codfw.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/fr-tech-admins
===== NODE GROUP =====
(2) search-loader2002.codfw.wmnet,search-loader1002.eqiad.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/nrpe-check_dpkg
===== NODE GROUP =====
(1) mwmaint1002.eqiad.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/ldap-admins
===== NODE GROUP =====
(1) mwmaint2002.codfw.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/ldap-admins
/etc/sudoers.d/nagios_check_mcrouter_client
===== NODE GROUP =====
(2) seaborgium.wikimedia.org,serpens.wikimedia.org
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/nagios
===== NODE GROUP =====
(1) mc-misc2001.codfw.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
ssh: connect to host mc-misc2001.codfw.wmnet port 22: Connection timed out
===== NODE GROUP =====
(54) cp[2027-2041].codfw.wmnet,cp[6001-6016].drmrs.wmnet,cp[5025-5032].eqsin.wmnet,cp[3074-3081].esams.wmnet,cp[4045-4051].ulsfo.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/nrpe-check_check-varnish-uds-frontend--run-varnish-frontend-1-socket
/etc/sudoers.d/nrpe-check_check-varnish-uds-frontend--run-varnish-frontend-2-socket
/etc/sudoers.d/nrpe-check_check-varnish-uds-frontend--run-varnish-frontend-3-socket
/etc/sudoers.d/nrpe-check_check-varnish-uds-frontend--run-varnish-frontend-4-socket
/etc/sudoers.d/nrpe-check_check-varnish-uds-frontend--run-varnish-frontend-5-socket
/etc/sudoers.d/nrpe-check_check-varnish-uds-frontend--run-varnish-frontend-6-socket
/etc/sudoers.d/nrpe-check_check-varnish-uds-frontend--run-varnish-frontend-7-socket
===== NODE GROUP =====
(14) dns[1004-1006,2004-2006,3003-3004,4003-4004,5003-5004,6001-6002].wikimedia.org
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/nrpe-check_check_service_restart_ntp-service
===== NODE GROUP =====
(1) releases1003.eqiad.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/sudo-jenkins-slave-docker-pusher
===== NODE GROUP =====
(4) cloudnet[2005-2006]-dev.codfw.wmnet,cloudnet[1005-1006].eqiad.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/neutron_sudoers
===== NODE GROUP =====
(34) cloudvirt[1031-1061].eqiad.wmnet,cloudvirtlocal[1001-1003].eqiad.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/neutron_sudoers
/etc/sudoers.d/nova-common
===== NODE GROUP =====
(9) cloudvirt[2004-2006]-dev.codfw.wmnet,cloudvirt[1062-1067].eqiad.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/neutron_sudoers
/etc/sudoers.d/nova-common
/etc/sudoers.d/nrpe-check_raid_perc_raid
/etc/sudoers.d/nrpe-get_raid_status_perccli
===== NODE GROUP =====
(1) cloudgw2002-dev.codfw.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/labtest-roots
===== NODE GROUP =====
(2) cloudlb[2002-2003]-dev.codfw.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/labtest-roots
/etc/sudoers.d/nrpe-check_ferm_active
===== NODE GROUP =====
(1) cloudgw2003-dev.codfw.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/labtest-roots
/etc/sudoers.d/nrpe-check_raid_perc_raid
/etc/sudoers.d/nrpe-get_raid_status_perccli
===== NODE GROUP =====
(12) an-presto[1006,1008,1010,1012-1015].eqiad.wmnet,dumpsdata1007.eqiad.wmnet,kafka-logging[2004-2005].codfw.wmnet,stat[1009-1010].eqiad.wmnet
----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
/etc/sudoers.d/nrpe-check_raid_megaraid
/etc/sudoers.d/nrpe-check_raid_perc_raid
/etc/sudoers.d/nrpe-get_raid_status_megacli
/etc/sudoers.d/nrpe-get_raid_status_perccli
216===== NODE GROUP =====
217(13) centrallog2002.codfw.wmnet,centrallog1002.eqiad.wmnet,dbprov2003.codfw.wmnet,dbprov1003.eqiad.wmnet,prometheus[2005-2006].
218codfw.wmnet,prometheus6002.drmrs.wmnet,prometheus[1005-1006].eqiad.wmnet,prometheus5002.eqsin.wmnet,prometheus3003.esams.wmnet,
219prometheus7001.magru.wmnet,prometheus4002.ulsfo.wmnet
220----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
221/etc/sudoers.d/README.dpkg-dist
222===== NODE GROUP =====
223(6) dbprov[2004-2006].codfw.wmnet,dbprov[1004-1006].eqiad.wmnet
224----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
225/etc/sudoers.d/README.dpkg-dist
226/etc/sudoers.d/nrpe-check_raid_perc_raid
227/etc/sudoers.d/nrpe-get_raid_status_perccli
228===== NODE GROUP =====
229(1) krb2002.codfw.wmnet
230----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
231/etc/sudoers.d/README.dpkg-dist
232/etc/sudoers.d/nrpe-check_ferm_active
233===== NODE GROUP =====
234(5) cephosd[1001-1005].eqiad.wmnet
235----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
236/etc/sudoers.d/ceph-smartctl
237/etc/sudoers.d/nrpe-check_ferm_active
238===== NODE GROUP =====
239(1) backupmon1001.eqiad.wmnet
240----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
241/etc/sudoers.d/nrpe-check_mariadb_dump_es4_codfw
242/etc/sudoers.d/nrpe-check_mariadb_dump_es4_eqiad
243/etc/sudoers.d/nrpe-check_mariadb_dump_es5_codfw
244/etc/sudoers.d/nrpe-check_mariadb_dump_es5_eqiad
245===== NODE GROUP =====
246(99) an-test-druid1001.eqiad.wmnet,an-test-presto1001.eqiad.wmnet,arclamp2001.codfw.wmnet,arclamp1001.eqiad.wmnet,cephosd[2001-
2472003].codfw.wmnet,cloudlb2004-dev.codfw.wmnet,cloudlb[1001-1002].eqiad.wmnet,cloudnet[2007-2008]-dev.codfw.wmnet,cuminunpriv100
2481.eqiad.wmnet,doc[2002-2003].codfw.wmnet,doc1003.eqiad.wmnet,durum2001.codfw.wmnet,durum[6001-6002].drmrs.wmnet,durum[1001-1002
249].eqiad.wmnet,durum[5001-5002].eqsin.wmnet,durum3004.esams.wmnet,durum[7001-7002].magru.wmnet,durum[4001-4002].ulsfo.wmnet,ethe
250rpad2002.codfw.wmnet,etherpad1004.eqiad.wmnet,ganeti[2024,2033-2044].codfw.wmnet,ganeti[6001-6004].drmrs.wmnet,ganeti[1027,1031
251,1039-1052].eqiad.wmnet,ganeti[5004-5007].eqsin.wmnet,ganeti[3005-3008].esams.wmnet,gerrit[2002-2003].wikimedia.org,gitlab[1003
252-1004,2002-2003].wikimedia.org,idm-test1001.wikimedia.org,idp-test[1004,2004-2005].wikimedia.org,irc1003.wikimedia.org,krb1002.
253eqiad.wmnet,lists[1004,2001].wikimedia.org,netmon[1003,2002].wikimedia.org,people2003.codfw.wmnet,people1004.eqiad.wmnet,phab10
25404.eqiad.wmnet,planet2003.codfw.wmnet,planet1003.eqiad.wmnet,pybal-test2003.codfw.wmnet,stewards2001.codfw.wmnet,stewards1001.e
255qiad.wmnet,testreduce1002.eqiad.wmnet,testvm2004.codfw.wmnet,vrts2002.codfw.wmnet,vrts1003.eqiad.wmnet
256----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
257/etc/sudoers.d/nrpe-check_ferm_active
258===== NODE GROUP =====
259(1) gerrit1003.wikimedia.org
260----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
261/etc/sudoers.d/nrpe-check_ferm_active
262/etc/sudoers.d/scap_gerrit2
263===== NODE GROUP =====
264(1) phab1005.eqiad.wmnet
265----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
266/etc/sudoers.d/nrpe-check_ferm_active
267/etc/sudoers.d/scap_phab-deploy
268/etc/sudoers.d/scap_sudo_rules_phab-deploy_phabricator_deployment
269===== NODE GROUP =====
270(2) miscweb2003.codfw.wmnet,miscweb1003.eqiad.wmnet
271----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
272/etc/sudoers.d/nrpe-check_ferm_active
273/etc/sudoers.d/scap_deploy-design
274/etc/sudoers.d/scap_deploy-service
275/etc/sudoers.d/scap_deploy-service_iegreview
276===== NODE GROUP =====
277(298) an-presto[1007,1009,1011,1016-1020].eqiad.wmnet,an-redacteddb1001.eqiad.wmnet,an-worker[1157-1178,1180-1184,1187-1208].eq
278iad.wmnet,backup[2012-2014].codfw.wmnet,backup[1012-1014].eqiad.wmnet,db[2153-2243].codfw.wmnet,db[1185-1245,1247-1257].eqiad.w
279mnet,dbstore[1008-1009].eqiad.wmnet,druid[1009-1011].eqiad.wmnet,dse-k8s-worker[1005-1008].eqiad.wmnet,dumpsdata1006.eqiad.wmne
280t,es[2035-2046].codfw.wmnet,es[1035-1046].eqiad.wmnet,kafka-jumbo[1010-1018].eqiad.wmnet,kafka-logging[1004-1005].eqiad.wmnet,k
281afka-stretch2002.codfw.wmnet,ms-be[2081-2089].codfw.wmnet,ms-be[1083-1087,1089-1090].eqiad.wmnet,pc[2015-2017].codfw.wmnet,pc[1
282015-1017].eqiad.wmnet,sretest1003.eqiad.wmnet,thanos-be2005.codfw.wmnet,thanos-be1005.eqiad.wmnet
283----- OUTPUT of 'locate-unmanaged /etc/sudoers.d' -----
284/etc/sudoers.d/nrpe-check_raid_perc_raid
285/etc/sudoers.d/nrpe-get_raid_status_perccli
Tue, May 13, 12:21 PM · Patch-For-Review, Puppet, Infrastructure-Foundations, SRE
taavi claimed T359972: Dev environment Keystone crashes.

The Keystone update done in https://gerrit.wikimedia.org/r/c/labs/striker/+/1145140 fixes the issue, at least for me.

Tue, May 13, 11:05 AM · Striker