Wikitech
labswiki
https://wikitech.wikimedia.org/wiki/Main_Page
MediaWiki 1.46.0-wmf.21
Deployments/Archive/2012
2026-03-29T08:16:27Z
Minorax
{{Deployment archive header|2012}}
== January ==
* Tuesday Jan 3: I18n weekly deployment 18:00-20:00 UTC (10am - 12pm PST) - I18n updates [Niklas]
* Tuesday Jan 3, 19:30-21:30 UTC (11:30am - 1:30pm PST) - PSC listener deployment/queue handling updates [Arthur, Jeremy]
* Tuesday, January 3, 22:00-23:00 UTC (2pm-3pm PST): Parser cache purge script deployment [Tim, Asher]
* Wednesday, January 4, 2012 19:00-21:00 UTC (11am-12pm PST): Editor Engagement weekly deployment - MoodBar updates [Roan, RMoen, BSitu]
* Wednesday, 20:00-21:00 UTC (12PM-1PM PST): ArticleFeedbackV5 (test for new feedback links) [Roan/Fabrice/OmniTI]
* Monday - afternoon PST / late evening/night UTC: Testing & deployment of code relevant to potential SOPA blackout, e.g. CongressLookup extension
* Tuesday - postponed lily replacement [Mark]
* Wednesday - 5:00 AM UTC (Tuesday 9:00 PM PST): SOPA blackout via CentralNotice [Ryan Kaldari / Brandon Harris]
* [POSTPONED] <s>Wednesday - 18:00-19:00 UTC (10:00-11:00AM PST): locke maintenance</s>
* Rolling decommissioning of old Squid servers
* Friday, Jan 20 - 10 PM UTC (2 PM PST) - Set up new English Wikipedia replicas
* Rolling decommissioning of old Squid servers continues
* Monday Jan 23, 20:00 UTC (11:00 PST): Locke maintenance [Chris/Mark]
* Monday Jan 23, 23:00-00:00 UTC (3:00pm-4:00pm PST): librsvg updates
* Tuesday, through rest of the week: SwiftMedia thumbnail test [Ben/Mark]
* Wednesday Jan 25, 21:00 - 23:00 UTC (1:00pm - 3:00pm PST): FeaturedFeeds deployment [Arthur/Max]
* Wednesday Jan 25, 19:00 UTC (10:00 PST): Testing multicasting in Production TPA datacenter (test completed, plan B initiated)
* Deployed database to add redundancy / capacity - db53 (s1), db26 (s7), db51 (s4) and db50 (s6)
* Wed, 1/25/12. 20:00 UTC - Fix Varnish configuration problem in production bits.wikimedia.org
* Monday Jan 30, 20:00-21:00 UTC (11:00am - 12:00pm PST) Central Notice updates [Arthur/Kaldari]
* Week of Jan 30 - Rolling decommissioning of old Tampa Squid servers continues [ChrisJ] - Done
* Week of Jan 30 - Pre-populating SWIFT thumbnails - ongoing for the week [Ben] - http://ganglia.wikimedia.org/2.2.0/?c=Swift%20pmtpa&h=Swift%20pmtpa%20prod&m=load_one&r=day&s=by%20name&hc=4&mc=2
* Tues Jan 31 2130-2200 UTC (1:30-2:00pm PST) DonationInterface language only updates on regular cluster [Arthur]
== February ==
* Wed Feb 1 (whole week) - Start of Squid @ EQIAD rolling tests and limited deployment [Mark/Peter/Asher]
* Wednesday, February 1, 22:00-23:00 UTC (2pm-3pm PST): Backport/deployment of Swift thumbnail purging code to 1.18 [Aaron]
* Week of February 6-10 - continued Squid @ EQIAD rolling tests and limited deployment [Mark/Peter/Asher] <strong>done</strong>
* Monday, February 6, 22:00-23:00 UTC (2pm-3pm PST): Single shard (1 of 256) redirection of thumbnail traffic to Swift [Ben/Aaron] <strong>done</strong>
* Tuesday, February 7, 18:30-19:00 UTC (10:30-11am PST): Continued redirection of thumbnail traffic to Swift [Ben/Aaron] <strong>done</strong>
* Tuesday, February 7, 22:00-23:00 UTC (2pm-3pm PST): Continued redirection of thumbnail traffic to Swift [Ben/Aaron] <strong>done</strong>
* Wednesday, February 8, 17:30-18:00 UTC (9:30-10am PST): Continued redirection of thumbnail traffic to Swift [Ben/Aaron] <strong>done</strong>
* Wednesday, February 8, 19:00-20:00 UTC (11am-12pm PST): Editor engagement projects (MoodBar, Feedback dashboard, etc) [Roan] <strong>done</strong>
* Wednesday, February 8, 22:00-23:00 UTC (2pm-3pm PST): Continued redirection of thumbnail traffic to Swift [Ben/Aaron] <strong>done</strong>
* Thursday, February 9, 18:00-19:00 UTC (10am-11am PST): Complete redirection of thumbnail traffic to Swift [Ben/Aaron] <strong>done</strong>
* Friday, Feb 10, 12:00 - 12:15 UTC (4:00am - 4:15am PST): Deploy cp1001-1005 as API Squids @ Eqiad <strong>done</strong>
* Week of February 13-15 - More rolling db schema changes throughout week - to prepare for 1.19 release [Asher] <strong>ongoing</strong>
* Monday, February 13(-14), 23:00-01:00 UTC (3pm-5pm PST): MediaWiki 1.19 test deployment to test2
* Wednesday, February 15 (-16), 23:00-01:00 UTC (3pm-5pm PST): MediaWiki 1.19 stage 1 deployment (mediawikiwiki, strategywiki, usabilitywiki, simplewiki, simplewiktionary, hewikisource, frwikisource, eowiki, metawiki, betawikiversity, enwikiquote, enwikibooks)
* Saturday, February 11, 19:00-21:00 UTC (10am-12noon PST): DB9 move to another rack [ChrisJ/RobH]
* The below and future 1.19 deployment dates are pending the resolution of known blocker issues:
** Tuesday, February 21(-22), 23:00-03:00 UTC (3pm-7pm PST): MediaWiki 1.19 stage 2 deployment (commons)
** Wednesday, February 22, 18:00 - 22:00 UTC ( 10:00 am - 2:00 pm PST) - MW 1.19 stage 2 - commons (2nd attempt)
** Thursday, February 23 (-24), 23:00-03:00 UTC (3pm-7pm PST): MediaWiki 1.19 stage 3 deployment (all projects except Wikipedia)
* rolling additional throughout week of Feb 20 - SWIFT deployment and adding more backend thumbnail servers
* Week of Feb 20 - rolling incremental fixes to Search clusters @ Tampa
* Week of Feb 20 spinning up and testing Search node boxes @ EQIAD (phase 1)
* Monday, February 27 - master DB switch for s1, s5, s6 (Asher) - done
* Monday, February 27 (-28), 23:00-03:00 UTC (3pm-7pm PST): MediaWiki 1.19 stage 4 deployment [nlwiki, plwiki] - done
* Wednesday, February 29 (-March 1), 23:00-03:00 UTC (3pm-7pm PST): MediaWiki 1.19 stage 5 deployment all Wikipedia - done
== March ==
* Thursday, March 1: Add 2 new servers to bits.wikimedia.org @ eqiad for capacity growth and redundancy - done
* Thursday, March 8: MobileFrontend updates (see [[mw:Extension:MobileFrontend/Deployments]])
* Week of March 5 - spinning up new SWIFT thumbnail boxes @ Tampa
* Monday, March 5 - putting Swift thumbnail storage back into production
* Tuesday, March 6, 2012 - Labs GlusterFS deployment
* Wednesday, March 7, 18:00-19:00 UTC (10am-11am PST): ArticleFeedbackv5 update
* Thursday, March 8 : test serving traffic thru EQIAD upload.wikimedia.org
* Tuesday, March 13: MobileFrontend updates (see [[mw:Extension:MobileFrontend/Deployments#13_March.2C_2012]])
* <s>Week of March 19: ramp up serving traffic thru EQIAD upload.wikimedia.org [Mark B]</s> [POSTPONED]
* Week of March 19: spinning up search node boxes @ EQIAD [Peter/Asher]
* Monday, March 19, 22:00-23:00 UTC (3:00pm-4:00pm PDT): Updated MobileFrontend code [Patrick, Arthur]
* Week of March 19: enwiki schema upgrade for SHA-1 hashes [Asher]
* Tuesday, March 20, 21:00-23:00 UTC (2:00pm-4:00pm PDT): Update MobileFrontend code [Patrick, Arthur]
* Monday 10:00am - 11:00am PST - Enabling Peering port @ EQIAD [Leslie]
* run parallel tests on EQIAD search ( with TAMPA cluster) [JeffG/Peter/Asher]
* Wednesday, March 28: AFTv5 data collection patch [Roan]
* Ongoing: Enhance Varnish and restart serving traffic thru EQIAD upload.wikimedia.org [Mark B]
== April ==
* Ongoing: Partial production rollout Search cluster @ EQIAD (probably Week of April 2) [mark/peter] [Done]
* Ongoing: Wednesday, 17:00-19:00 UTC (10am-12pm PDT): AFTv5 release ("New Feedback form with Abuse/Spam Filters" + some improvements to oversight and metrics stuff.) [Roan]
* Tuesday, April 10, 18:00-20:00 UTC (11am-1pm PDT): MediaWiki 1.20wmf01 deployment - test2 and mediawiki.org
* Thursday April 12, 4-5 PDT : Deploy WP:Zero changes
* Thursday, April 12, 10-noon PDT: AFT5 changes
* Ongoing dev/test: Enhance Varnish and restart serving traffic thru EQIAD upload.wikimedia.org [Mark B]
* Production rollout of new Search cluster @ EQIAD (Time: tbd, April 10) [Mark/Peter/Jeff]
* Daily, EN Database schema changes (to prep for SHA1 implementation, expected to end Tuesday, 4/10/12) [Asher]
* Monday, April 16 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see mw:Extension:MobileFrontend/Deployments) [Patrick, Arthur]
* Monday, April 16, 18:00-20:00 UTC (11am-1pm PDT): MediaWiki 1.20wmf01 deployment - Deploy to commons
* Wednesday, April 18, 18:00-20:00 UTC (11am-1pm PDT): MediaWiki 1.20wmf01 deployment - Deploy to all non-Wikipedia sites (Wiktionary, Wikisource, Wikinews, Wikibooks, Wikiquote, Wikiversity)
* Ongoing dev/test: Enhance Varnish and restart serving traffic thru EQIAD upload.wikimedia.org [Mark B]
* Tuesday, April 17 17:00-18:00 UTC (10am - 11am PDT): MobileFrontend bugfix deployment [Arthur]
* Wednesday, April 18, 19:00-22:00 (12noon - 3pm PDT) - Deploy IPV6 on XO circuit@EQIAD
* Monday, April 23, 18:00-20:00 UTC (11am-1pm PDT): MediaWiki 1.20wmf01 deployment - Deploy to English Wikipedia
* Wednesday, April 25, 18:00-20:00 UTC (11am-1pm PDT): MediaWiki 1.20wmf01 deployment - Deploy to other Wikipedias + misc. remaining wikis
* Monday, 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur]
* Tuesday, 16:30-17:30 UTC (9:30am-10:30am PDT): Internationalization bug fixes/updates [Niklas / Antoine for git assistance]
* Tuesday, 22:00-23:00 UTC (3pm-4pm PDT): Configuration change for Swift on commons [Aaron, Ben]
* Thursday 17:00-19:00 UTC (10am-noon PDT): ArticleFeedbackv5 updates [Roan]
* Monday or Tuesday: Deploy Oxygen as UDP logging host@EQIAD (2nd attempt)
* Monday, April 30, 18:00-20:00 UTC (11am-1pm PDT): MediaWiki 1.20wmf2 to mediawiki.org and test2
* Monday, April 30, 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur]
== May ==
* Tuesday, May 1, 16:30-17:30 UTC (9:30am-10:30am PDT): Internationalization bug fixes/updates [Niklas / Antoine for git assistance]
* Wednesday, May 2, 18:00-20:00 UTC (11am-1pm PDT): MediaWiki 1.20wmf2 to non-Wikipedia wikis
* Wednesday, May 2, 22:00-23:30 UTC (3pm-4:30pm PDT): PageTriage initial deployment (early prototype of List View via private url, see [http://ee-prototype.wmflabs.org/wiki/Special:PageTriage labs prototype]) [Roan/Ian] [partial, rescheduled]
* Thursday, May 3, 21:00-21:30 UTC (2pm-2:30pm PDT): Switch ES writes to new cluster23 tables [Asher] '''[done]'''
* Thursday, May 3, 21:00 onwards: script to move /usr/local/apache to /a partition on all non-imagescaler, non-jobrunner apaches '''[done]'''
* Thursday, May 3, 22:00-23:30 UTC (3pm-4:30pm PDT): ArticleFeedbackv5 updates (tentatively moved here due to Monthly Metrics) [Roan] '''[bumped for PageTriage]'''
* Friday, May 4, ongoing - upgrade squid servers@tpa to latest lucid
* Enable EnotifWatchlist on all wikis, 1 h window with ops to check load ([[bugzilla:28026]], [[RT:1784]]) [Scheduling: Reedy] -- ?
* Friday, May 4, 17:00-18:00 UTC (10am-11am PDT): Gerrit 2.3 upgrade, may affect code review (test instance: http://gerrit-dev.wmflabs.org )
* on-going server reboot for those Apache servers running Lucid kernel 2.6.32 (and lower) and over 200 days of uptime
* on-going S2 database refresh
* Friday, May 4, 17:00-19:00 UTC (10am-12pm PDT): Zero Partner Testing
* Monday, May 7, 18:00-20:00 UTC (11am-1pm PDT): MediaWiki 1.20wmf2 deployment - Deploy to English Wikipedia
* Monday, May 7, 21:00-22:00 UTC (2pm-3pm PDT): PageTriage enwiki experimental deploy [Ian/Roan]
* Deploy uplink to rack A4 (sdtpa) (2nd attempt)
* Monday, May 7, 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur]
* Tuesday, May 8, 16:30-17:30 UTC (9:30am-10:30am PDT): Internationalization bug fixes/updates [Niklas / Antoine for git assistance / Roan]
* Tuesday, May 8, 1900-2100 UTC (1pm - 3pm PDT): Deploy new swift thumbnail container sharding scheme [Ben / Aaron]
* Tuesday, May 8, 2100-2200 UTC (3pm - 4pm PDT): MobileFrontend bug fixes [Patrick/Arthur]
* Wednesday, May 9, 18:00-20:00 UTC (11am-1pm PDT): MediaWiki 1.20wmf2 deployment - Deploy to other Wikipedias
* Wednesday, May 9, 16:00-18:00 UTC (9am-11am PDT): Wikipedia Zero partner testing [Dan/Patrick]
* Thursday, May 10, 17:00-19:00 UTC (10am-noon PDT): ArticleFeedbackv5 updates [Roan]
* Thursday, May 10, 17:00-19:30 UTC (10am-12:30pm PDT):Wikipedia Zero partner testing [Dan/Patrick] (overlap coordinated with Roan)
* Thursday, May 10, 18:00 - 20:00 UTC (11am - 1pm PDT): Re-home 2nd network transport link @ SDTPA (from core router to core switch) [Mark/Leslie]
* setting up external IPs to various servers [Leslie]
* limited Precise testing [Mark]
* OS (Lucid) upgrade/patch train - ongoing
** upgrade/patch all DB servers to latest Lucid release
** upgrade/patch all Squid servers (ESAM)
** upgrade/patch all Squid servers (TPA/EQIAD)
* (continuing) setting up external IPs to various servers [Leslie]
* Monday May 14, 17:00-18:00 UTC (10am-11am PDT): MediaWiki 1.20wmf2 deployment - Deploy to labsconsole
* Monday, May 14 18:00-20:00 UTC (11am-1pm PDT): MediaWiki 1.20wmf3 deployment to test, test2, and mediawiki.org
* <s>Monday, May 14 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur]</s>
* <s>Tuesday, May 15 16:30-17:30 UTC (9:30am-10:30am PDT): Internationalization bug fixes/updates [Niklas / Antoine for git assistance]</s> Cancelled
* Tuesday, May 15 17:30-19:00 UTC (10:30am-noon PDT): SwiftBackend to purge thumbnails instead of calling cloudfiles directly; exercise parallel purging (Rolled back. Will try again Thurs.)
* Tuesday, May 15 20:00-22:00 UTC (1pm-3pm PDT): Editor Engagement deployment window [Ian, Kaldari, Benny]
* Tuesday, May 15 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur]
* Wednesday, May 16 17:30-18:00 UTC (10:30am-11am PDT): MobileFrontend quick fix [Arthur]
* Wednesday, May 16 18:00-20:00 UTC (11am-1pm PDT): MediaWiki deployment to non-Wikipedia wikis
* Wednesday, May 16 17:30-19:00 UTC (9:30am-11am PDT): Wikipedia Zero partner testing [Dan/Patrick]
* Thursday, May 17 17:00-19:00 UTC (10am-noon PDT): ArticleFeedbackv5 updates [Roan]
* Thursday, May 17 17:30-19:00 UTC (10:30am-noon PDT): Wikipedia Zero partner testing [Dan/Patrick]
* Thursday, May 17 20:00-22:00 UTC (1pm-3pm PDT): SwiftBackend to purge thumbnails and SwiftBackend to write thumbnails instead of rewrite.py (some wikis)
* TBD - deploy additional Swift monitoring
* Monday, May 21 18:00-20:00 UTC (11am-1pm PDT): MediaWiki 1.20wmf3 to en.wikipedia.org [RobLa, Aaron, Sam]
* <s>Monday, May 21 20:00-21:00 UTC (1pm-2pm PDT): PageTriage deployment and re-enabling [Roan, Kaldari]</s>
* Monday, May 21 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur]
* Tuesday, May 22 16:30-17:30 UTC (9:30am-10:30am PDT): Internationalization bug fixes/updates [Niklas / Sam]
* Tuesday, May 22, 17:30-19:00 UTC (10:30am-noon PDT): SwiftBackend to write thumbnails instead of rewrite.py (all wikis) [Ben, Aaron]
* Wednesday May 23 18:00-20:00 UTC (11am-1pm PDT): MediaWiki 1.20wmf3 to all wikipedia.org [RobLa, Aaron, Sam]
* Wednesday May 23 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero Partner testing [Dan, Patrick]
* Wednesday May 23 21:00-23:00 UTC (2pm-4pm PDT): SwiftBackend to write thumbnails instead of rewrite.py (all wikis) [Ben, Aaron]
* Thursday May 24 16:30-17:00 UTC (9:30am-10am PDT): MobileFrontend bug fix [Arthur]
* Thursday May 24 17:00-19:00 UTC (10am-12pm PDT): ArticleFeedbackv5 updates [Roan]
* Thursday May 24 17:30-19:00 UTC (10:30am-12pm PDT): Wikipedia Zero Partner testing [Dan, Patrick]
* Thursday, May 24 20:00-22:00 UTC (1pm-3pm PDT): CentralNotice and LandingCheck updates for fundraising [Arthur, Peter Gehres]
* Thursday May 24 21:00-22:00 UTC (2pm-3pm PDT): UploadWizard push to Commons to pick up recent updates [Aaron, Brion]
* Monday, May 28 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments 1.20wmf4 to test, test2, mediawiki.org] [Sam / RobLa ]
* Tuesday, May 29 07:00-08:00 UTC (12:00am-1:00am PDT): Internationalization bug fixes/updates [Niklas / Sam]
* Wednesday May 30 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments 1.20wmf4 to non-Wikipedia sites (Wiktionary, Wikisource, Wikinews, Wikibooks, Wikiquote, Wikiversity, and a few other sites)] [Aaron / RobLa]
* Wednesday May 30 2100-2300 UTC (2pm - 4pm PDT) Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur]
== June ==
* Monday, June 4 1.20wmf4 to enwiki
* Tuesday, June 5 17:30-20:00 (10:30am-1pm PDT): ArticleFeedbackv5 updates [Mathias/Roan]
* Tuesday, June 5 20:00-22:00 (1pm-3pm PDT): PageTriage updates [Benny/Kaldari/Roan]
* Wednesday, June 6 1.20wmf4 to *.wikipedia.org
* Wednesday, June 6, 12:00 UTC - IPV6 day rollout
* Thursday, June 7 16:00-17:00 (9am-10am PDT): Deploy new Swift Monitoring [Ben]
* Thursday, June 7 20:00-22:00 (1pm-3pm PDT): Deploy E3 Experiments/LastModified to en.wiki (reverted)
* Monday, June 11, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki general deployment window] 1.20wmf5 to test, test2, and mediawiki.org [Sam / Aaron / RobLa ]
* Monday, June 11, 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur]
* Tuesday, June 12, 09:00-10:00 UTC (02:00am-03:00am PDT): Internationalization bug fixes/updates [Niklas / Sam ]
* Tuesday, June 12 17:30-20:00 UTC (10:30am-1pm PDT): ArticleFeedbackv5 updates [Mathias/Roan]
* Tuesday, June 12, 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur]
* Wednesday, June 13 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Wednesday, June 13 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki general deployment window] 1.20wmf5 to all non Wikipedia sites [Sam / Aaron / RobLa]
* Thursday, June 14 15:00-16:00 UTC (8am-9am PDT): [[mw:Extension:Education_Program]] on test2 [Sam / RobLa]
* Thursday, June 14 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Thursday, June 14 17:00-19:00 UTC (10am-12pm PDT): PageTriage deployment & Moodbar changes [Benny]
* Thursday, June 14 22:00-00:00 UTC (3pm-5pm PDT): E3 deployments [Kaldari / Ori]
* Monday, June 18, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki general deployment window] 1.20wmf5 to enwiki (DONE) [Sam / Aaron / RobLa ]
* Monday, June 18 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur]
* Tuesday, June 19 09:00-10:00 UTC (02:00am-03:00am PDT): Internationalization bug fixes/updates [Niklas / Sam]
* Tuesday, June 19 17:00-18:30 UTC (10:00am-11:30am PDT): swift SSD testing [Ben / RobH]
* Tuesday, June 19, 17:30-19:30 UTC (10:30am-12:30pm PDT): ArticleFeedbackv5 updates [Roan]
* Tuesday, June 19, 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur]
* Wednesday, June 20, 13:00-14:00 UTC (6am-7am PDT): Enable [[mw:Extension:Education_Program]] extension on enwiki [Sam, Jeroen]
* Wednesday, June 20, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Wednesday, June 20 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki general deployment window] 1.20wmf5 to Wikipedia sites [Sam / Aaron / RobLa]
* Wednesday June 20 22:00-00:00 UTC (3pm-5pm PDT): E2 and E3 deployments window [Benny / Kaldari / Ori / Alolita]
* Thursday, June 21 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Thursday, June 21 18:00-21:00 UTC (11am-2pm PDT): VisualEditor deployment to mediawiki.org [Roan]
* Thursday, June 21 21:00-23:00 UTC (2pm - 3pm PDT): Mobile redirect deployment for squid [Preilly]
* Thursday, June 21 22:00-00:00 UTC (3pm-5pm PDT): Alternative E2 and E3 deployments window [Benny / Kaldari / Ori / Alolita]
* Ongoing: SWIFT cluster upgrades with SSD drives for faster metadata/listing retrieval
* Monday, June 25 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki 1.20wmf6 to mediawiki.org, test, and test2] [Sam / Aaron / RobLa ]
* Monday, June 25, 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur]
* Tuesday, June 26, 09:00-10:00 UTC (02:00am-03:00am PDT): Internationalization bug fixes/updates [Niklas / Sam]
* Tuesday, June 26, 17:30-19:30 UTC (10:30am-12:30pm PDT): ArticleFeedbackv5 updates [Roan] (rolled back)
* Tuesday, June 26, 21:30-22:00 UTC (2:30pm-3:00pm PDT): Putting be5 (Swift node) back into rotation
* Tuesday, June 26, 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur]
* Wednesday, June 27, 14:00-17:00 UTC (7am-10am PDT): limited testing of IPv6 prefix announcement support (Pybal 1.02 release).
* Wednesday, June 27, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Wednesday, June 27, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki general deployment window] [Sam / Aaron / RobLa]
* Thursday, June 28, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Thursday, June 28, 18:00-20:00 UTC (11am-1pm PDT): E2 deployment [Benny / Kaldari]
* Thursday, June 28, 20:00-22:00 UTC (1pm-3pm PDT): E3 deployments [Kaldari / Ori / Alolita]
* Thursday, June 28, 21:00 onwards UTC (2pm onwards PDT): Thumbnail deletion job [Ben/Ariel]
* Friday, June 29, 17:00 - 18:00 UTC (10am - 11am PDT): DNS resolver server@EQIAD
== July ==
* Monday, July 2, 17:00-17:30 UTC (10am-10:30am PDT): Wikipedia Zero launch for Niger [Dan / Patrick] '''Done'''
* Monday, July 2, 18:00-20:00 UTC (11am-1pm PDT): MediaWiki 1.20wmf6 *.wikipedia.org [Sam / Aaron / RobLa ] '''Done'''
* Monday, July 2, 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see mw:Extension:MobileFrontend/Deployments) [Patrick, Arthur, Max] '''Happened Tuesday'''
* Tuesday, July 3, 09:00-10:00 UTC (02:00am-03:00am PDT): Internationalization bug fixes/updates [Niklas / Sam] '''Did not take place'''
* Tuesday, July 3, 17:00-17:30 UTC (10am-10:30am PDT): Wikipedia Zero partner testing [Dan / Patrick] '''Done'''
* Tuesday, July 3, 17:30-19:30 UTC (10:30am-12:30pm PDT): ArticleFeedbackv5 updates [Roan] '''Done'''
* Tuesday, July 3, 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see mw:Extension:MobileFrontend/Deployments) [Patrick, Arthur, Max] '''Done'''
* Thursday, July 5, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick] '''Done'''
* Thursday, July 5, 18:00-20:00 UTC (11am-1pm PDT): E2 deployment & AFTv5 config change [Benny / Kaldari] '''Done'''
* Thursday, July 5, 20:00-22:00 UTC (1pm-3pm PDT): E3 deployments [Kaldari / Ori / Alolita] '''Done'''
* Thursday, July 5, 22:00-23:00 UTC (3pm-4pm PDT): Add Swift nodes [Ben] '''Done'''
* Thursday, July 5, 23:00-00:00 UTC (4pm-5pm PDT): Gerrit upgrade [Chad / Ryan] '''Done'''
* Friday, July 6, 13:00-14:00 UTC (6am-7am PDT): PyBal/LVS upgrade [Mark] '''Done'''
* Friday, July 6, 16:30-17:00 UTC (10:30am-11:00am PDT): Switch to new Swift Container Listing (SSDs) [Ben] '''Done'''
* Monday, July 9, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Monday, July 9, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki 1.20wmf7 to test, test2, mediawiki.org] [Sam / Aaron]
* Monday, July 9, 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur, Max]
* <s>Tuesday, July 10, 09:00-10:00 UTC (02:00am-03:00am PDT): Internationalization bug fixes/updates [Niklas / Sam]</s> '''DC Hackathon'''
* Tuesday, July 10, 17:30-19:30 UTC (10:30am-12:30pm PDT): ArticleFeedbackv5 updates [Roan] '''DC Hackathon'''
* Tuesday, July 10, 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur, Max]
* <s>Wednesday, July 11, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]</s> '''DC Hackathon'''
* Wednesday, July 11, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki 1.20wmf7 to non-Wikipedia sites] [Sam / Aaron]
* Thursday, July 12, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* <s> Thursday, July 12, 18:00-20:00 UTC (11am-1pm PDT): E2 deployment [Benny / Kaldari] </s> '''Wikimania'''
* <s>Thursday, July 12, 20:00-22:00 UTC (1pm-3pm PDT): E3 deployments [Kaldari / Ori / Alolita]</s> '''Wikimania'''
* Monday, July 16, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki 1.20wmf7 to enwiki] [Sam / Aaron / RobLa]
* Tuesday, July 17, 09:00-10:00 UTC (02:00am-03:00am PDT): Internationalization bug fixes/updates [Niklas / Sam / Alolita]
* Tuesday, July 17, 17:30-19:30 UTC (10:30am-12:30pm PDT): ArticleFeedbackv5 updates [Roan]
* Wednesday, July 18, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Wednesday, July 18, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki 1.20wmf7 to *.wikipedia.org] [Sam / Aaron / RobLa]
* Thursday, July 19, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Thursday, July 19, 18:00-20:00 UTC (11am-1pm PDT): E2 deployment [Benny / Kaldari]
* Thursday, July 19, 20:00-22:00 UTC (1pm-3pm PDT): E3 deployments [Kaldari / Ori / Alolita]
* Friday, July 20, 14:00-15:00 UTC (7am - 8am PDT) : Deploy Hydrogen@eqiad - DNS resolver & SMTP server [Mark]
* Friday, July 20, 18:00-19:00 UTC (11am-12pm PDT): Gerrit DB Migration [Asher / Ryan]
* Monday, July 23 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki 1.20wmf8 to mediawiki.org, test, test2] [Sam / Aaron / RobLa ]
* Monday, July 23 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur, Max]
* Tuesday, July 24 09:00-10:00 UTC (02:00am-03:00am PDT): Internationalization bug fixes/updates [Niklas / Sam / Alolita]
* Tuesday July 24 17:30-19:30 UTC (10:30am-12:30pm PDT): ArticleFeedbackv5 updates [Matthias/Roan]
* Tuesday, July 24 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur, Max]
* Wednesday July 25 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Wednesday July 25 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki 1.20wmf8 to all non-Wikipedia wikis] [Sam / Aaron / RobLa]
* Thursday July 26 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* <s> Thursday, July 26 18:00-20:00 UTC (11am-1pm PDT): E2 deployment [Benny / Kaldari] - </s>
* Thursday, July 26 20:00-22:00 UTC (1pm-3pm PDT): E3 deployments [Kaldari / Ori / Alolita]
* Mon - Friday : Labs project instance migration [Ryan/Faidon]
* Ongoing - Apache-on-precise test deployment
* Monday, July 30, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki general deployment window] [Sam / Aaron / RobLa]
* Monday, July 30, 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur, Max]
* <s> Tuesday, July 31, 09:00-10:00 UTC (02:00am-03:00am PDT): Internationalization bug fixes/updates </s> [Niklas / Sam / Alolita]
* Tuesday, July 31, 16:00-17:00 UTC (9am-10am PDT): Swift 404 handler change [Ben, Aaron] '''Done'''
* Tuesday, July 31, 17:30-19:30 UTC (10:30am-12:30pm PDT): ArticleFeedbackv5 updates [Matthias]
* Tuesday, July 31 21:00-22:00 UTC (2pm-3pm PDT): Timed Media Handler [Aaron /Jan / Michael Dale]
* Tuesday, July 31, 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur, Max]
== August ==
* Wednesday, August 1, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Wednesday, August 1, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki general deployment window] [Sam / Aaron / RobLa]
* <s>Wednesday, August 1, 20:00-23:00 UTC (1pm-4pm PDT): Echo deployment on MediaWiki [Andrew]</s>
* Wednesday, August 1, 22:00-23:00 UTC (3pm-4pm PDT): MultiWrite support to test*, mediawiki.org [Aaron]
* Wednesday, August 1, 22:00-23:00 UTC (3pm-4pm PDT): Labs GlusterFS project storage upgrade [Ryan Lane]
* Thursday, August 2, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Thursday, August 2, 18:00-20:00 UTC (11am-1pm PDT): Echo deployment on MediaWiki [Andrew]
* Thursday, August 2, 20:00-22:00 UTC (1pm-3pm PDT): E3 deployments [Kaldari / Ori / Alolita]
* Thursday, August 2, 22:00-23:00 UTC (3pm-4pm PDT): MultiWrite to everything [Aaron / RobLa]
* Monday, Aug 6, 12:00-17:00 UTC (5am - 10am PDT) - Cabling and setting up new router @ ESAMS to replace current Foundry router. Possible impact to ESAMS traffic.
* Monday, August 6, 17:00-18:00 UTC (10am-11am PDT) - Adding Swift to MultiWrite config (reads from NFS, writes to both) test*, mediawiki.org [Aaron / RobLa]
* Monday, August 6, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki general deployment window] [Sam / Aaron / RobLa ]
* Tuesday, August 7, 09:00-10:00 UTC (02:00am-03:00am PDT): Internationalization bug fixes/updates [Niklas / Sam / Alolita]
* <s>Tuesday, August 7, 17:30-19:30 UTC (10:30am-12:30pm PDT): ArticleFeedbackv5 updates [Matthias]</s>
* Tuesday, August 7, 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick]
* Wednesday, August 8, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Wednesday, August 8, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki general deployment window] [Sam / Aaron / RobLa]
* Wednesday, August 8, 20:00-22:00 UTC (1pm-3pm PDT): CentralNotice deployment [Kaldari]
* Thursday, August 9, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Thursday, August 9, 18:00-20:00 UTC (11am-1pm PDT): E2 deployment [Benny / Kaldari]
* Thursday, August 9, 20:00-22:00 UTC (1pm-3pm PDT): E3 deployments [Kaldari / Ori / Alolita]
* Thursday, August 9 22:00-00:00 UTC (3pm-5pm PDT) - Adding Swift to MultiWrite config (reads from NFS, writes to both) everything else [Aaron / RobLa]
* Ongoing: Rolling update from Swift 1.4.3 to 1.5
* Monday, August 13, 17:00-18:00 UTC (10am-11am PDT) - Swift default read (reads from Swift, writes to both) test*, mediawiki.org
* Monday, August 13, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki general deployment window] [Sam / Aaron / RobLa]
* Tuesday, August 14, 09:00-10:00 UTC (02:00am-03:00am PDT): Internationalization bug fixes/updates [Niklas / Sam / Alolita]
* Tuesday, August 14, 17:30-19:30 UTC (10:30am-12:30pm PDT): E2 & AFTv5 deployment (patches) [Kaldari]
* Tuesday, August 14, 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur, Max]
* Tuesday, August 14 23:30-01:30 (next day) (4:30pm-6:30pm PDT) - Lua deployment to test2 [Tim] - depends on [https://rt.wikimedia.org/Ticket/Display.html?id=3365 RT 3365]
* Wednesday, August 15, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Wednesday, August 15, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki general deployment window] [Sam / Aaron / RobLa]
* Thursday, August 16, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Thursday, August 16, 18:00-20:00 UTC (11am-1pm PDT) - Swift default read (reads from Swift, writes to both) everything else, plus Apache rewrite rules [Aaron / Ben / RobLa]
* Thursday, August 16, 20:00-22:00 UTC (1pm-3pm PDT): E3 deployments [Kaldari / Ori / Alolita]
* Thursday, August 16, 22:00-00:00 UTC (3pm-5pm PDT) - E2 deployment [Benny / Kaldari]
* Monday, August 20, 16:00-18:00 UTC (9am-11am PDT): Attempt #1 (failed): Squid configuration to point to Swift for originals [Ben/Aaron/Faidon]
* Monday, August 20, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki general deployment window] [Sam / Aaron / RobLa]
* Monday, August 20, 22:00-23:00 UTC (3pm-4:30pm PDT): UploadWizard deployment [Kaldari]
* Tuesday, August 21, 09:00-10:00 UTC (02:00am-03:00am PDT): Internationalization bug fixes/updates [Niklas / Sam / Alolita]
* Tuesday, August 21, 16:00-18:00 UTC (9am-11am PDT), 2 hours needed - upgrade production swift cluster to swift version 1.5.0-3 - [Ben/Faidon/Ariel]
* <s>Tuesday, August 21, 17:30-19:30 UTC (10:30am-12:30pm PDT): ArticleFeedbackv5 updates [Matthias / Roan]</s>
* Tuesday, August 21, 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur, Max]
* Tuesday, August 21, 23:30-01:30 (next day) (4:30pm-6:30pm PDT) - Category collation on Portuguese Wikipedia {{bug|35632}} [Tim]
* <s>Wednesday, August 22, 14:00 - 15:00, (7am - 8am PDT) - Tampa NFS /home from nfs1/nfs2 DRBD cluster to the NetApp.</s> [Mark] ''Deployments, fenari logins and general access to /home including MediaWiki configuration will become unavailable during this maintenance window.''
* Wednesday, August 22, 16:00-17:00 UTC (9am-10am PDT): Squid configuration to point to Swift for originals [Ben/Aaron/Faidon]
* Wednesday, August 22, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Wednesday, August 22, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki general deployment window] [Sam / Aaron / RobLa]
* Thursday, August 23, 16:00-17:00 UTC (9am-10am PDT): Squid configuration to point to Swift for originals [Ben/Aaron/Faidon]
* Thursday, August 23, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Thursday, August 23, 18:00-20:00 UTC (11am-1pm PDT): E2 deployment [Benny / Kaldari] (will be used for AFTv5 updates)
* Thursday, August 23, 20:00-22:00 UTC (1pm-3pm PDT): E3 deployments [Kaldari / Ori / Alolita]
* Friday, August 24, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Monday, August 27, 16:00-18:00 UTC (9am-11am PDT): Squid configuration switch from NFS to Swift [Ben/Aaron/Faidon/Ariel] '''[done]'''
* Monday, August 27, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki general deployment window] [Sam / Aaron / RobLa]
* Tuesday, August 28, 09:00-10:00 UTC (02:00am-03:00am PDT): Internationalization bug fixes/updates [Niklas / Sam / Alolita]
* <strike>Tuesday, August 28, 16:00-18:00 UTC (9am-11am PDT): Squid configuration switch from NFS to Swift [Ben/Aaron/Faidon/Ariel]</strike> cancelled
* Tuesday, August 28, 18:00-19:30 UTC (11:00am-12:30pm PDT): ArticleFeedbackv5 & PageCuration updates [Matthias / Benny]
* Tuesday, August 28, 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur, Max]
* Wednesday, August 29, 13:00 - 14:00 UTC, (7am - 8am PDT) - Tampa NFS /home from nfs1/nfs2 DRBD cluster to the NetApp. [Mark] ''Deployments, fenari logins and general access to /home including MediaWiki configuration will become unavailable during this maintenance window.''
* Wednesday, August 29, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Wednesday, August 29, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki general deployment window] [Sam / Aaron / RobLa]
* Wednesday, August 29(?) - upgrade production swift cluster to swift version 1.5.0-3 - [Aaron/Faidon/Ariel]
* Thursday, August 30, Friday 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Thursday, August 30, 18:00-20:00 UTC (11am-1pm PDT): UploadWizard deployment [Benny / Kaldari]
* Thursday, August 30, 20:00-22:00 UTC (1pm-3pm PDT): E3 deployments [Kaldari / Ori / Alolita]
* Thursday, August 30, 22:00-23:00 UTC (3pm-4pm PDT): WLM-related stuff [Max] - done
== September ==
* Monday, Sept 3, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki general deployment window] [Sam / Aaron / RobLa]
* Tuesday, Sept 4, 09:00-10:00 UTC (02:00am-03:00am PDT): Internationalization bug fixes/updates [Niklas / Sam / Alolita]
* Tuesday, Sept 4, 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur, Max]
* Wednesday, Sept 5, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Wednesday, Sept 5, 17:00-18:00 UTC (10am-11am PDT): Enabling CORS for the API (bug 20814) [Roan]
* Wednesday, Sept 5, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki general deployment window] [Sam / Aaron / RobLa]
* Wednesday, Sept 5, time TBD: Limited Upload Varnish deployment/testing @ Eqiad
* Thursday, Sept 6, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Thursday, Sept 6, 18:00-20:00 UTC (11am-1pm PDT): E2 deployment [Benny / Kaldari]
* Thursday, Sept 6, 20:00-22:00 UTC (1pm-3pm PDT): E3 deployments [ Ori / S Page ]
* Friday, Sept 7 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Monday, Sep 10, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki general deployment window] [Sam / Aaron / RobLa]
* Monday, Sep 10, 20:00-21:00 UTC (1pm-2pm PDT): Deployment of WLM-related stuff [Max/Arthur]
* Tuesday, Sep 11, 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur, Max]
* Wednesday, Sep 12, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki general deployment window] [Sam / Aaron / RobLa]
* Wednesday, Sep 12, 22:30-00:00 UTC (3:30pm-5pm PDT): Mobile deployment [Max]
* Thursday, Sep 13, 1900-1930 (10am-1030am PDT): MF deployment (WLM banner enable) [Arthur]
* Monday, Sept 17, 18:00-20:00 UTC (11am-1pm PDT): Enable $wgHtml5 on all sites [https://bugzilla.wikimedia.org/show_bug.cgi?id=27478 bug 27478] - [Sam/RobLa] [DONE]
* Tuesday, Sep 18, 09:00-10:00 UTC (02:00am-03:00am PDT): Internationalization bug fixes/updates [Niklas / Sam / Alolita] [DONE]
* Monday-Friday Sept 17-20, 17:00 - 23:00 UTC (10am - 4pm): Apache server upgrade to Precise [Peter] (ongoing)
* Tuesday, Sep 18, 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur, Max]
* Wednesday, Sep 19, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Wednesday, Sep 19, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki general deployment window] [Sam / Aaron / RobLa]
* Thursday, Sep 20, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Thursday, Sep 20, 18:00-20:00 UTC (11am-1pm PDT): E2 deployment: Page Curation Final + AFTv5 patches [Benny / Kaldari]
* Thursday, Sep 20, 20:00-22:00 UTC (1pm-3pm PDT): E3 deployments [ Ori / S Page ]
* Friday, Sep 21, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Monday-Friday Sept 24-28, 17:00 - 23:00 UTC (10am - 4pm): Apache server upgrade to Precise, including image scalers [Peter]
* Monday, Sep 24, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki general deployment window] + TMH backports [Sam / Aaron / RobLa]
* Tuesday, Sep 25, 09:00-10:00 UTC (02:00am-03:00am PDT): Internationalization bug fixes/updates [Niklas / Sam / Alolita]
* Tuesday, Sep 25, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Tuesday, Sep 25, 22:00-23:00 UTC (3pm-4pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur, Max]
* Wednesday, Sep 26, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Wednesday, Sep 26, 18:00-20:00 UTC (11am-1pm PDT): [http://www.mediawiki.org/wiki/MediaWiki_1.20/Roadmap#Schedule_for_the_deployments MediaWiki general deployment window] [Sam / Aaron / RobLa]
* Thursday, Sep 27, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Thursday, Sep 27, 18:00-20:00 UTC (11am-1pm PDT): E2 deployment: Page Curation patches [Benny / Kaldari]
* Thursday, Sep 27, 20:00-22:00 UTC (1pm-3pm PDT): E3 deployments / MicroDesign [Ori / S Page /Rob Moen]
* Friday, Sep 28, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
== October ==
* Monday, Oct 1, 18:00-20:00 UTC (11am-1pm PDT): [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki 1.21 general deployment window]] [Sam / Aaron / RobLa]
* Monday, Oct 1, 22:00-23:00 UTC (3pm-5pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur, Max]
* Tuesday, Oct 2, 09:00-10:00 UTC (02:00am-03:00am PDT): Internationalization bug fixes/updates [Niklas / Sam / Alolita]
* Tuesday, Oct 2, 17:30-19:30 UTC (10:30am-12:30pm PDT): ArticleFeedbackv5 updates [Matthias] (?)
* Tuesday, Oct 2, 22:00-23:00 UTC (3pm-5pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur, Max]
* Wednesday, Oct 3, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Wednesday, Oct 3, 18:00-20:00 UTC (11am-1pm PDT): [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki 1.21 general deployment window]] [Sam / Aaron / RobLa]
* Thursday, Oct 4, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Thursday, Oct 4, 18:00-20:00 UTC (11am-1pm PDT): E2 deployment [Benny / Kaldari] (?)
* Thursday, Oct 4, 20:00-22:00 UTC (1pm-3pm PDT): E3 deployments [ Ori / S Page ]
* Friday, Oct 5, 11:00-14:00 UTC (04:00am-07:00am PDT): Migrating images from ms7 to NetApp [Mark / Ariel / Reedy] (reverted, trying again on Tuesday(?))
* Friday, Oct 5, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Monday, Oct 8, 11:00-14:00 UTC (04:00am-07:00am PDT): Migrating images from ms7 to NetApp [Mark / Ariel / Reedy]
* Monday, Oct 8, 18:00-20:00 UTC (11am-1pm PDT): [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki 1.21 general deployment window]] [Sam / Aaron / RobLa]
* Tuesday, Oct 9, 09:00-10:00 UTC (02:00am-03:00am PDT): Internationalization bug fixes/updates [Niklas / Sam / Alolita]
* Tuesday, Oct 9, 17:30-19:30 UTC (10:30am-12:30pm PDT): ArticleFeedbackv5 updates [Matthias / Roan]
* Tuesday, Oct 9, 22:00-23:00 UTC (3pm-5pm PDT): Updated MobileFrontend code (see [[mw:Extension:MobileFrontend/Deployments|mw:Extension:MobileFrontend/Deployments]]) [Patrick, Arthur, Max]
* Wednesday, Oct 10, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Wednesday, Oct 10, 18:00-20:00 UTC (11am-1pm PDT): [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki 1.21 general deployment window]] [Sam / Aaron / RobLa]
* Thursday, Oct 11, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Thursday, Oct 11, 18:00-20:00 UTC (11am-1pm PDT): E2 deployment [Benny / Kaldari]
* Thursday, Oct 11, 20:00-22:00 UTC (1pm-3pm PDT): E3 deployments [ Ori / S Page ]
* Friday, Oct 12, 17:00-18:00 UTC (10am-11am PDT): Wikipedia Zero partner testing [Dan / Patrick]
* Friday, Oct 12, 20:00-21:00 UTC (1pm-2pm PDT): Gerrit 2.4.2-2 (current build + OpenStack patch) [Chad / Ryan]
* Monday, Oct 15, 18:00-20:00, 11 a.m. - 1 p.m., [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki general deployment window (1.21)]] ContentHandler branch [Sam / Aaron / RobLa]
* Monday, Oct 15, 22:00-0000, 3 p.m. - 5 p.m., [[mw:Extension:MobileFrontend/Deployments|MobileFrontend updates]] [Patrick / Arthur / Max]
* Tuesday, Oct 16, 09:00-10:00, 2:00 a.m. - 3:00 a.m., Internationalization bug fixes/updates [Niklas / Sam / Alolita]
* <s>Tuesday, Oct 16, 17:30-19:30, 10:30 a.m. - 12:30 p.m., ArticleFeedbackv5 updates [Matthias / Roan]</s>
* Tuesday, Oct 16, 22:00-0000, 3 p.m. - 5 p.m., [[mw:Extension:MobileFrontend/Deployments|MobileFrontend updates]] [Patrick / Arthur / Max]
* Wednesday, Oct 17, 17:00-18:00, 10 a.m. - 11 a.m., Wikipedia Zero partner testing [Dan / Patrick]
* Wednesday, Oct 17, 18:00-20:00, 11 a.m. - 1 p.m., [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki general deployment window (1.21)]] [Sam / Aaron / RobLa]
* Thursday, Oct 18, 17:00-18:00, 10 a.m. - 11 a.m., Wikipedia Zero partner testing [Dan / Patrick]
* <s>Thursday, Oct 18, 18:00-20:00, 11 a.m. - 1 p.m., E2 deployment [Benny / Kaldari]</s>
* Thursday, Oct 18, 20:00-22:00, 1 p.m. - 3 p.m., E3 deployments: ACUX, PEF [Kaldari / S Page / Ori]
{| class="wikitable"
! Day (UTC)
! Time (UTC)
! Time (PDT)
! Description
|-
| Friday, Oct 19
| 17:00-18:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|-
| Monday, Oct 22
| 18:00-20:00
| 11 a.m. - 1 p.m.
| [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki 1.21wmf2 to commons, meta, and en.wikipedia]] [Sam / Aaron / RobLa]
|-
| Monday, Oct 22
| 17:00-20:00
| 10 a.m. - 1 p.m.
| Upgrade Varnish instances (upload@eqiad) with range-seeking feature & live with limited traffic [Mark]
|-
| Monday, Oct 22
| 16:00-18:00
| 9 a.m. - 11 a.m.
|Upgrade ms-fe@tampa to fix memory leak [Faidon]
|-
| Monday, Oct 22
| 22:00-0000
| 3 p.m. - 5 p.m.
| [[mw:Extension:MobileFrontend/Deployments|MobileFrontend updates]] [Patrick / Arthur / Max]
|-
| Tuesday, Oct 23
| 09:00-10:00
| 2:00 a.m. - 3:00 a.m.
| Internationalization bug fixes/updates [Niklas / Sam / Alolita]
|-
| Tuesday, Oct 23
| 17:30-19:30
| 10:30 a.m. - 12:30 p.m.
| <s>ArticleFeedbackv5 clicktracking updates [Matthias]</s> ''aborted''
|-
| Tuesday, Oct 23
| 22:00-0000
| 3 p.m. - 5 p.m.
| [[mw:Extension:MobileFrontend/Deployments|MobileFrontend updates]] [Patrick / Arthur / Max]
|-
| Wednesday, Oct 24
| 15:00-17:00
| 8 a.m. - 10 a.m.
| Wikidata.org test server deployment [Chad/Sam]
|-
| Wednesday, Oct 24
| 17:00-18:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|-
| Wednesday, Oct 24
| 18:00-20:00
| 11 a.m. - 1 p.m.
| [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki 1.21wmf2 to *.wikipedia.org]] [Sam / Aaron / RobLa]
|-
| Thursday, Oct 25
| 14:00-16:00
| 7 a.m. - 9 a.m.
| Gerrit/Jenkins to precise - [Chad / Antoine / Faidon]
|-
| Thursday, Oct 25
| 17:00-18:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|-
| Thursday, Oct 25
| 18:00-20:00
| 11 a.m. - 1 p.m.
| <s>Fundraising test [Kaldari / Peter / Katie]</s> ''aborted'' <br>[[mw:AFTv5|AFT]] event tracking updates [Matthias / Kaldari / Ori]
|-
| Thursday, Oct 25
| 20:00-22:00
| 1 p.m. - 3 p.m.
| E3 deployments - [[mw:Event logging|EventTracking]] [Ori]<br>[[mw:Post-edit_feedback|PEF (multiple countries)]] [Kaldari / S Page / Ori] <br>[[mw:ACUX|ACUX]] API, etc. [Kaldari / S Page]
|-
| Friday, Oct 26
| 17:00-18:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|-
| Monday, Oct 29
| 18:00-20:00
| 11 a.m. - 1 p.m.
| [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki general deployment window (1.21)]] [Sam / Aaron / RobLa]
|-
| Monday, Oct 29
| 22:00-0000
| 3 p.m. - 5 p.m.
| [[mw:Extension:MobileFrontend/Deployments|MobileFrontend updates]] [Patrick / Arthur / Max]
|-
| Monday, Oct 29
| 00:00-01:00
| 5 p.m. - 6 p.m.
| Re-enable [[mw:Extension:EventLogging|EventLogging]] for enwiki [Ori]
|-
| Tuesday, Oct 30
| 09:00-10:00
| 2:00 a.m. - 3:00 a.m.
| Internationalization bug fixes/updates [Niklas / Sam / Alolita]
|-
| Tuesday, Oct 30
| 17:30-19:30
| 10:30 a.m. - 12:30 p.m.
| ArticleFeedbackv5 updates [Matthias / Roan]
|-
| Tuesday, Oct 30
| 20:30-21:30
| 1:30 p.m. - 2:30 p.m.
| pecl-memcached for testwiki [Asher / Aaron]
|-
| Tuesday, Oct 30
| 22:00-0000
| 3 p.m. - 5 p.m.
| [[mw:Extension:MobileFrontend/Deployments|MobileFrontend updates]] [Patrick / Arthur / Max]
|-
| Wednesday, Oct 31
| 00:00-02:00
| 5 p.m. - 7 p.m.
| CentralNotice - Buckets/Cache Change [Kaldari / Adam Wight / Matt Walker]
|-
| Wednesday, Oct 31
| 17:00-18:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|-
| Wednesday, Oct 31
| 18:00-20:00
| 11 a.m. - 1 p.m.
| [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki general deployment window (1.21)]] [Sam / Aaron / RobLa]
|}
== November ==
{| class="wikitable"
! Day (UTC)
! Time (UTC)
! Time (PST)
! Description
|-
| Thursday, Nov 1
| 16:00-17:00
| 9 a.m. - 10 a.m.
| Timed Media Handler to English Wikipedia (Jan/Aaron)
|-
| Thursday, Nov 1
| 17:00-18:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|-
| Thursday, Nov 1
| 18:00-20:00
| 11 a.m. - 1 p.m.
| E2 deployment [Benny / Kaldari]
|-
| Thursday, Nov 1
| 20:00-22:00
| 1 p.m. - 3 p.m.
| E3 deployments [Ori / S Page]
|-
| Thursday, Nov 1
| 22:00-24:00
| 3 p.m. - 5 p.m.
| CentralNotice Bug Fixes [Mwalker, Pgehres, Kaldari]
|-
| Friday, Nov 2
| 17:00-18:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|-
| Monday, Nov 5
| 18:00-19:00
| 10 a.m. - 11 a.m.
| Timed Media Handler to everything except commons
|-
| Monday, Nov 5
| 19:00-21:00
| 11 a.m. - 1 p.m.
| [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki general deployment window (1.21)]] [Sam / Aaron / RobLa]
|-
| Monday, Nov 5
| 21:00-22:00
| 1:00 p.m. - 2:00 p.m.
| pecl-memcached ramp up [Asher / Aaron]
|-
| Monday, Nov 5
| 23:00-0100
| 3 p.m. - 5 p.m.
| [[mw:Extension:MobileFrontend/Deployments|MobileFrontend updates]] [Patrick / Arthur / Max]
|-
| Tuesday, Nov 6
| 09:00-10:00
| 1:00 a.m. - 2:00 a.m.
| Internationalization bug fixes/updates [Niklas / Sam / Alolita]
|-
| Tuesday, Nov 6
| 18:30-20:30
| 10:30 a.m. - 12:30 p.m.
| [[mw:AFTv5|ArticleFeedbackv5]] updates [Matthias]
|-
| Tuesday, Nov 6
| 23:00-0100
| 3 p.m. - 5 p.m.
| [[mw:Extension:MobileFrontend/Deployments|MobileFrontend updates]] [Patrick / Arthur / Max]
|-
| Wednesday, Nov 7
| 17:00-18:00
| 9 a.m. - 10 a.m.
| Timed Media Handler to commons
|-
| Wednesday, Nov 7
| 18:00-19:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|-
| Wednesday, Nov 7
| 19:00-21:00
| 11 a.m. - 1 p.m.
| [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki general deployment window (1.21)]] [Sam / Aaron / RobLa]
|-
| Thursday, Nov 8
| 18:00-19:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|-
| Thursday, Nov 8
| 19:00-21:00
| 11 a.m. - 1 p.m.
| E2 deployment: [[mw:Echo|Echo]] deploy on mediawiki.org [Benny / Kaldari]
|-
| Thursday, Nov 8
| 18:00-19:00
| 10 a.m. - 11 a.m.
| Gallium upgrade [Faidon/Hashar]
|-
| Thursday, Nov 8
| 21:00-23:00
| 1 p.m. - 3 p.m.
| E3 deployments: [[mw:ACUX|ACUX with clientside validation]] & more [[mw:Post-edit_feedback|Post-edit]] [Ori / S Page]
|-
| Thursday, Nov 8
| 23:00-01:00
| 3 p.m. - 5 p.m.
| Fundraising: Fixing allocation bug #41862. Also X-domain data transfer using CN cookies. [Pgehres, Mwalker]
|-
| Friday, Nov 9
| 18:00-19:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|-
| Monday, Nov 12 (holiday)
| 19:00-21:00
| 11 a.m. - 1 p.m.
| [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki general deployment window (1.21)]] [Sam / Aaron / RobLa]
|-
| Tuesday, Nov 13
| 09:00-10:00
| 1:00 a.m. - 2:00 a.m.
| Internationalization bug fixes/updates [Niklas / Sam / Alolita]
|-
| <s>Tuesday, Nov 13</s>
| <s>18:30-20:30</s>
| <s>10:30 a.m. - 12:30 p.m.</s>
| <s>[[mw:AFTv5|ArticleFeedbackv5]] updates [Matthias]</s>
|-
| Tuesday, Nov 13
| 23:00-0100
| 3 p.m. - 5 p.m.
| [[mw:Extension:MobileFrontend/Deployments|MobileFrontend updates]] [Patrick / Arthur / Max]
|-
| Wednesday, Nov 14
| 19:00-21:00
| 11 a.m. - 1 p.m.
| [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki general deployment window (1.21)]] [Sam / Aaron / RobLa]
|-
| Wednesday, Nov 14
| 21:00-22:00
| 1 p.m. - 2 p.m.
| Fundraising [Peter]
|-
| Wednesday, Nov 14
| 22:00-23:00
| 2 p.m. - 3 p.m.
| [[mw:VisualEditor|VE]] dark launch [Roan]
|-
| <s>Thursday, Nov 15</s>
| <s>19:00-21:00</s>
| <s>11 a.m. - 1 p.m.</s>
| <s>[[mw:Article_feedback/Version_5|AFTv5]] deployment [Matthias]</s><br />
delayed to next Tues
|-
| <s>Thursday, Nov 15</s>
| <s>21:00-23:00</s>
| <s>1 p.m. - 3 p.m.</s>
| <s>E3 deployments [Ori / S Page]</s><br />
Moved to not interfere with FR-tech testing
|-
| Monday, Nov 19
| 19:00-21:00
| 11 a.m. - 1 p.m.
| [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki general deployment window (1.21)]] [Sam / Aaron / RobLa]
|-
| Monday, Nov 19
| 23:00-01:00
| 3 p.m. - 5 p.m.
| [[mw:Extension:MobileFrontend/Deployments|MobileFrontend updates]] [Patrick / Arthur / Max]
|-
| Tuesday, Nov 20
| 09:00-10:00
| 1:00 a.m. - 2:00 a.m.
| Internationalization bug fixes/updates [Niklas / Sam / Alolita]
|-
| Tuesday, Nov 20
| 18:30-20:30
| 10:30 a.m. - 12:30 p.m.
| <s>[[mw:AFTv5|ArticleFeedbackv5]] updates [Matthias]</s><br />
[[mw:Extension:UploadWizard|UploadWizard]] Flickr support turned on [Kaldari] (deployed, turned off)
|-
| Tuesday, Nov 20
| 21:00-23:00
| 1 p.m. - 3 p.m.
| [[mw:Extension:MobileFrontend/Deployments|MobileFrontend updates]] [Patrick / Arthur / Max]
|-
| Wednesday, Nov 21
| 18:00-19:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|-
| Wednesday, Nov 21
| 19:00-21:00
| 11 a.m. - 1 p.m.
| [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki general deployment window (1.21)]] [Sam / Aaron / RobLa]
|-
| Thursday, Nov 22
| 18:00-19:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|-
| <s>Thursday, Nov 22 </s>
| <s>19:00-21:00</s>
| <s>11 a.m. - 1 p.m.</s>
| <s>E2 deployment [Benny / Kaldari] </s>
|-
| Thursday, Nov 22
| 21:00-23:00
| 1 p.m. - 3 p.m.
| PEF to sv/pt wikis [Ori / S Page]
|-
| Friday, Nov 23
| 18:00-19:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|-
| Monday, Nov 26
| 19:00-21:00
| 11 a.m. - 1 p.m.
| [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki 1.21wmf5 to test, test2, mediawiki.org]] plus deployment of TemplateSandbox [Sam / Chad / Chris S / RobLa]
|-
| Tues-Thurs
| 16:00-24:00
| 8 a.m. - 4 p.m.
| Replacing Tampa Swift servers one at a time
|-
| Tuesday, Nov 27
| 09:00-10:00
| 1:00 a.m. - 2:00 a.m.
| Internationalization bug fixes/updates [Niklas / Sam / Alolita]
|-
| <s>Tuesday, Nov 27</s>
| <s>18:30-20:30</s>
| <s>10:30 a.m. - 12:30 p.m.</s>
| <s>[[mw:AFTv5|ArticleFeedbackv5]] updates [Matthias]</s>
|-
| Tuesday, Nov 27
| 21:00-23:00
| 1 p.m. - 3 p.m.
| [[mw:Extension:MobileFrontend/Deployments|MobileFrontend updates]] [Patrick / Arthur / Max]
|-
| Tuesday, Nov 27
| 23:00-0100
| 3 p.m. - 5 p.m.
| CentralNotice Hide Banners without Messing up Stats [Pgehres, Mwalker]
|-
| Wednesday, Nov 28
| 18:00-19:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|-
| Wednesday, Nov 28
| 19:00-21:00
| 11 a.m. - 1 p.m.
| [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki general deployment window (1.21)]] [Sam / Aaron / RobLa]
|-
| Thursday, Nov 29
| 18:00-19:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|-
| Thursday, Nov 29
| 19:00-21:00
| 11 a.m. - 1 p.m.
| Upload Wizard deployment - E2 deployment [Kaldari]
|-
| Thursday, Nov 29
| 21:00-23:00
| 1 p.m. - 3 p.m.
| E3 deployments [Ori / S Page]
|-
| Friday, Nov 30
| 18:00-19:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|}
== December ==
{| class="wikitable"
! Day (UTC)
! Time (UTC)
! Time (PST)
! Description
|-
| Monday, Dec 3
| 19:00-21:00
| 11 a.m. - 1 p.m.
| Deploy [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki 1.21wmf5]] to enwiki and itwikisource; deploy TemplateSandbox to enwiki; update Wikibase and related extensions on Wikidata.org; deploy Wikibase Client to test2wiki [Sam / Aaron / Chad / RobLa]
|-
| Tues, Dec 4 (UTC)
| 00:30-02:00
| 4:30 p.m. - 6:00 p.m.
| Deploy CentralNotice: PGehres, AWight
|-
| Tues, Dec 4 (UTC)
| 01:15-02:00
| 5:15 p.m. - 6:00 p.m.
| sync-dir Extension:E3Experiments changes to wmf5: spage (CentralNotice finished early)
|-
| Tues-Thurs, Dec 4-Dec 6
| 10:00-20:00
| 2 a.m. - 12 noon
| Draining traffic and replacing 2 Swift servers @ Tampa
|-
| Tuesday, Dec 4
| 09:00-10:00
| 1:00 a.m. - 2:00 a.m.
| Internationalization bug fixes/updates [Niklas / Sam / Alolita]
|-
| Tuesday, Dec 4
| 18:30-20:30
| 10:30 a.m. - 12:30 p.m.
| [[mw:AFTv5|ArticleFeedbackv5]] updates [https://de.wikipedia.org/wiki/Wikipedia:Artikel-Feedback dewiki] [Matthias]
|-
| Tuesday, Dec 4
| 21:00-23:00
| 1 p.m. - 3 p.m.
| [[mw:Extension:MobileFrontend/Deployments|MobileFrontend updates]] [Arthur / Max]
|-
| Tuesday, Dec 4
| 23:00-00:00
| 3 p.m. - 4 p.m.
| Copy Iabda8155 and Id4a2058f to 1.21wmf4, to hopefully avoid a repeat of CSS breakage tomorrow [Anomie]
|-
| Wednesday, Dec 5
| 18:00-19:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|-
| Wednesday, Dec 5
| 19:00-21:00
| 11 a.m. - 1 p.m.
| [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki general deployment window (1.21)]] [Sam / Aaron / RobLa]
|-
| Wednesday, Dec 5
| 22:00 - 00:00 (Dec 6)
| 2:00 p.m. - 4:00 p.m.
| MobileFrontend bug fix deployment [Arthur/Max]
|-
| Thursday, Dec 6
| 18:00-19:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|-
| Thursday, Dec 6
| 19:00-21:00
| 11 a.m. - 1 p.m.
| UploadWizard deployment [Kaldari]
|-
| Thursday, Dec 6
| 21:00-23:00
| 1 p.m. - 3 p.m.
| E3 deployment: [[mw:Extension:EventLogging|EventLogging]] update [Ori Livneh]
|-
| Thursday, Dec 6
| 23:00-00:00 (Dec 7)
| 3 p.m. - 4 p.m.
| MobileFrontend bug fix deployment [Arthur/Max]
|-
| Friday, Dec 7
| 18:00-19:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|-
| Friday, Dec 7
| 20:00-22:00
| 11 a.m. - 1 p.m.
| E3 deployments: [[mw:Extension:E3Experiments|E3Experiments]] update [Ori / S Page]
|-
| Monday, Dec 10
| 19:00-21:00
| 11 a.m. - 1 p.m.
| [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki general deployment window (1.21)]] [Sam / Aaron / RobLa] -- including WikidataClient update on test2 and [[mw:MediaWiki 1.21/wmf6]]
|-
| Monday, Dec 10
| 23:00-0:00
| 3 p.m. - 4 p.m.
| Deployment of maintenance notice for ContributionTracking in preparation for fundraising maint window at 01:00 UTC [PeterG]
|-
| Tuesday, Dec 11
| <s>09:00-10:00</s>
| <s>1:00 a.m. - 2:00 a.m.</s>
| <s>Internationalization bug fixes/updates [Niklas / Sam / Alolita]</s>
|-
| Tuesday, Dec 11
| 13:00-15:00
| 5:00 a.m. - 7:00 a.m.
| GeoData switch to Solr [Max]
|-
| Tuesday, Dec 11
| 21:00-23:00
| 1 p.m. - 3 p.m.
| [[mw:Extension:MobileFrontend/Deployments|MobileFrontend updates]] [Arthur / Max]
|-
| Tuesday, Dec 11
| 23:00 - 01:00
| 3 p.m. - 5 p.m.
| [[mw:VisualEditor|VisualEditor]] limited opt-in production launch - English Wikipedia [Roan / JamesF]
|-
| Wednesday, Dec 12
| 18:00-19:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|-
| Wednesday, Dec 12
| 19:00-21:00
| 11 a.m. - 1 p.m.
| [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki general deployment window (1.21)]] [Sam / Aaron / RobLa] [[mw:MediaWiki 1.21/wmf6]]
|-
| Wednesday
| 0:00-0:30
| 4 p.m. - 4:30 p.m.
| [[Lightning_deployments|Lightning deploy time]]
|-
| Thursday, Dec 13
| 18:00-19:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|-
| Thursday, Dec 13
| 19:00-21:00
| 11 a.m. - 1 p.m.
| [[mw:AFTv5|ArticleFeedbackv5]] updates. dewiki pilot AFTv5 deploy (~10k articles) [Matthias]
|-
| Thursday, Dec 13
| 21:00-23:00
| 1 p.m. - 3 p.m.
| E3 deployment: [[mw:Onboarding_new_Wikipedians|Onboarding]] [S Page / Ori / MarkTraceur / MattF]
|-
| Friday, Dec 14
| 18:00-19:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|-
| Monday, Dec 17
| 19:00-21:00
| 11 a.m. - 1 p.m.
| [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki general deployment window (1.21)]] [Sam / Aaron / RobLa]
|-
| Monday, Dec 17
| 22:00-23:30
| 2:00 p.m. - 3:30 p.m.
| fenari distribution upgrade [Peter]
|-
| Monday, Dec 17
| 0:00-0:30
| 4 p.m. - 4:30 p.m.
| [[Lightning_deployments|Lightning deploy time]] (proposed)
|-
| <s>Tuesday, Dec 18</s>
| <s>09:00-10:00</s>
| <s>1:00 a.m. - 2:00 a.m.</s>
| <s>Internationalization bug fixes/updates [Niklas / Sam / Alolita]</s>
|-
| Tuesday, Dec 18
| 18:30-20:30
| 10:30 a.m. - 12:30 p.m.
| [[mw:Echo|Echo]] deployment to mediawiki (for realz this time) [Benny / Kaldari]
|-
| Tuesday, Dec 18
| 21:00-23:00
| 1 p.m. - 3 p.m.
| [[mw:Extension:MobileFrontend/Deployments|MobileFrontend updates]] [Arthur / Max]
|-
| Tuesday, Dec 18
| 0:00-0:30
| 4 p.m. - 4:30 p.m.
| [[Lightning_deployments|Lightning deploy time]] (proposed)
|-
| Wednesday, Dec 19
| 18:00-19:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|-
| Wednesday, Dec 19
| 19:00-21:00
| 11 a.m. - 1 p.m.
| [[mw:MediaWiki 1.21/Roadmap#Schedule for the deployments|MediaWiki general deployment window (1.21)]] [Sam / Aaron / RobLa]
|-
| Wednesday, Dec 19
| 0:00-0:30
| 4 p.m. - 4:30 p.m.
| [[Lightning_deployments|Lightning deploy time]] [Benny / Kaldari]
|-
| Thursday, Dec 20
| 18:00-19:00
| 10 a.m. - 11 a.m.
| Wikipedia Zero partner testing [Dan / Patrick]
|-
| Thursday, Dec 20
| 19:00-21:00
| 11 a.m. - 1 p.m.
| E2 deployment [Benny / Kaldari]
|-
| Thursday, Dec 20
| 21:00-23:00
| 1 p.m. - 3 p.m.
| E3 deployments [Ori / S Page]
|-
| Thursday, Dec 20
| 0:00-0:30
| 4 p.m. - 4:30 p.m.
| [[Lightning_deployments|Lightning deploy time]] (proposed)
|-
| Wednesday, Dec 26
| 21:00-23:00
| 1 p.m. - 3 p.m.
| [[mw:Extension:MobileFrontend/Deployments|MobileFrontend updates]] [Arthur / Max]
|-
| Wednesday, Dec 26
| 0:00-0:30
| 4 p.m. - 4:30 p.m.
| [[Lightning_deployments|Lightning deploy time]] (proposed)
|-
| Thursday, Dec 27
| 19:00-21:00
| 11 a.m. - 1 p.m.
| E2 deployment (tentative) [Benny / Kaldari]
|-
| Thursday, Dec 27
| 21:00-23:00
| 1 p.m. - 3 p.m.
| E3 deployments [Ori / S Page]
|-
| Thursday, Dec 27
| 0:00-0:30
| 4 p.m. - 4:30 p.m.
| [[Lightning_deployments|Lightning deploy time]] (proposed)
|-
| Monday, Dec 31
| 0:00-0:30
| 4 p.m. - 4:30 p.m.
| [[Lightning_deployments|Lightning deploy time]] (proposed)
6neopht1c1tlqzj3owt0v26uxxrvoei
Server Admin Log/Archives
0
4673
2396614
2385365
2026-03-29T08:15:59Z
Minorax
38339
2396614
wikitext
text/x-wiki
<noinclude>{{process header
|previous=← [[Server Admin Log]]
|title=Server Admin Log
|section=(archives)
}}</noinclude>
<inputbox>
type=fulltext
prefix=Server Admin Log/
searchbuttonlabel=Search archives
break=no
</inputbox><noinclude>
==Archives==
</noinclude>
===2000s===
<div style="column-count:2;-moz-column-count:2;-webkit-column-count:2">
* [[Server Admin Log/Archive 1|Archive 1: 2004 Jun - 2004 Sep]]
* [[Server Admin Log/Archive 2|Archive 2: 2004 Oct - 2004 Nov]]
* [[Server Admin Log/Archive 3|Archive 3: 2004 Dec - 2005 Mar]]
* [[Server Admin Log/Archive 4|Archive 4: 2005 Apr - 2005 Jul]]
* [[Server Admin Log/Archive 5|Archive 5: 2005 Aug - 2005 Oct]], <small>with revision history 2004-06-23 to 2005-11-25</small>
* [[Server Admin Log/Archive 6|Archive 6: 2005 Nov - 2006 Feb]]
* [[Server Admin Log/Archive 7|Archive 7: 2006 Mar - 2006 Jun]]
* [[Server Admin Log/Archive 8|Archive 8: 2006 Jul - 2006 Sep]]
* [[Server Admin Log/Archive 9|Archive 9: 2006 Oct - 2007 Jan]], <small>with revision history 2005-11-25 to 2007-02-21</small>
* [[Server Admin Log/Archive 10|Archive 10: 2007 Feb - 2007 Jun]]
* [[Server Admin Log/Archive 11|Archive 11: 2007 Jul - 2007 Dec]]
* [[Server Admin Log/Archive 12|Archive 12: 2008 Jan - 2008 Jul]]
* [[Server Admin Log/2008-08|Archive 12a: 2008 Aug]]
* [[Server Admin Log/2008-09|Archive 12b: 2008 Sep]]
* [[Server Admin Log/Archive 13|Archive 13: 2008 Oct - 2009 Jun]]
* [[Server Admin Log/Archive 14|Archive 14: 2009 Jun - 2009 Dec]]
</div>
===2010s===
<div style="column-count:2;-moz-column-count:2;-webkit-column-count:2">
* [[Server Admin Log/Archive 15|Archive 15: 2010 Jan - 2010 Jun]]
* [[Server Admin Log/Archive 16|Archive 16: 2010 Jul - 2010 Oct]]
* [[Server Admin Log/Archive 17|Archive 17: 2010 Nov - 2010 Dec]]
* [[Server Admin Log/Archive 18|Archive 18: 2011 Jan - 2011 Jun]]
* [[Server Admin Log/Archive 19|Archive 19: 2011 Jul - 2011 Dec]]
* [[Server Admin Log/Archive 20|Archive 20: 2011 Dec - 2012 Jun]], <small>with revision history 2007-02-21 to 2012-03-27</small>
* [[Server Admin Log/Archive 21|Archive 21: 2012 Jul - 2013 Jan]]
* [[Server Admin Log/Archive 22|Archive 22: 2013 Jan - 2013 Jul]]
* [[Server Admin Log/Archive 23|Archive 23: 2013 Aug - 2013 Dec]]
* [[Server Admin Log/Archive 24|Archive 24: 2014 Jan - 2014 Mar]]
* [[Server Admin Log/Archive 25|Archive 25: 2014 April - 2014 September]]
* [[Server Admin Log/Archive 26|Archive 26: 2014 October - 2014 December]]
* [[Server Admin Log/Archive 27|Archive 27: 2015 January - 2015 July]]
* [[Server Admin Log/Archive 28|Archive 28: 2015 August - 2015 December]]
* [[Server Admin Log/Archive 29|Archive 29: 2016 January - 2016 May]]
* [[Server Admin Log/Archive 30|Archive 30: 2016 June - 2016 August]]
* [[Server Admin Log/Archive 31|Archive 31: 2016 September - 2016 December]]
* [[Server Admin Log/Archive 32|Archive 32: 2017 January - 2017 July]]
* [[Server Admin Log/Archive 33|Archive 33: 2017 August - 2017 December]]
* [[Server Admin Log/Archive 34|Archive 34: 2018 January - 2018 April]]
* [[Server Admin Log/Archive 35|Archive 35: 2018 May - 2018 August]]
* [[Server Admin Log/Archive 36|Archive 36: 2018 September - 2018 December]]
* [[Server Admin Log/Archive 37|Archive 37: 2019 January - 2019 April]]
* [[Server Admin Log/Archive 38|Archive 38: 2019 May - 2019 August]]
* [[Server Admin Log/Archive 39|Archive 39: 2019 September - 2019 December]]
</div>
===2020-2024===
<div style="column-count:2;-moz-column-count:2;-webkit-column-count:2">
* [[Server Admin Log/Archive 40|Archive 40: 2020 January - 2020 April]]
* [[Server Admin Log/Archive 41|Archive 41: 2020 May - 2020 July]]
* [[Server Admin Log/Archive 42|Archive 42: 2020 August - 2020 November]]
* [[Server Admin Log/Archive 43|Archive 43: 2020 December]]
* [[Server Admin Log/Archive 44|Archive 44: 2021 January - 2021 April]]
* [[Server Admin Log/Archive 45|Archive 45: 2021 May - 2021 July]]
* [[Server Admin Log/Archive 46|Archive 46: 2021 August - 2021 October]]
* [[Server Admin Log/Archive 47|Archive 47: 2021 November - 2021 December]]
* [[Server Admin Log/Archive 48|Archive 48: 2022 January]]
* [[Server Admin Log/Archive 49|Archive 49: 2022 February]]
* [[Server Admin Log/Archive 50|Archive 50: 2022 March]]
* [[Server Admin Log/Archive 51|Archive 51: 2022 April 1-15]]
* [[Server Admin Log/Archive 52|Archive 52: 2022 April 16-30]]
* [[Server Admin Log/Archive 53|Archive 53: 2022 May]]
* [[Server Admin Log/Archive 54|Archive 54: 2022 June]]
* [[Server Admin Log/Archive 55|Archive 55: 2022 July]]
* [[Server Admin Log/Archive 56|Archive 56: 2022 August]]
* [[Server Admin Log/Archive 57|Archive 57: 2022 September]]
* [[Server Admin Log/Archive 58|Archive 58: 2022 October]]
* [[Server Admin Log/Archive 59|Archive 59: 2022 November 1-15]]
* [[Server Admin Log/Archive 60|Archive 60: 2022 November 16-30]]
* [[Server Admin Log/Archive 61|Archive 61: 2022 December]]
* [[Server Admin Log/Archive 62|Archive 62: 2023 January]]
* [[Server Admin Log/Archive 63|Archive 63: 2023 February]]
* [[Server Admin Log/Archive 64|Archive 64: 2023 March]]
* [[Server Admin Log/Archive 65|Archive 65: 2023 April]]
* [[Server Admin Log/Archive 66|Archive 66: 2023 May]]
* [[Server Admin Log/Archive 67|Archive 67: 2023 June]]
* [[Server Admin Log/Archive 68|Archive 68: 2023 July]]
* [[Server Admin Log/Archive 69|Archive 69: 2023 August 1-15]]
* [[Server Admin Log/Archive 70|Archive 70: 2023 August 16-31]]
* [[Server Admin Log/Archive 71|Archive 71: 2023 September]]
* [[Server Admin Log/Archive 72|Archive 72: 2023 October]]
* [[Server Admin Log/Archive 73|Archive 73: 2023 November]]
* [[Server Admin Log/Archive 74|Archive 74: 2023 December]]
* [[Server Admin Log/Archive 75|Archive 75: 2024 January]]
* [[Server Admin Log/Archive 76|Archive 76: 2024 February]]
* [[Server Admin Log/Archive 77|Archive 77: 2024 March]]
* [[Server Admin Log/Archive 78|Archive 78: 2024 April]]
* [[Server Admin Log/Archive 79|Archive 79: 2024 May 1-15]]
* [[Server Admin Log/Archive 80|Archive 80: 2024 May 16-31]]
* [[Server Admin Log/Archive 81|Archive 81: 2024 June 1-15]]
* [[Server Admin Log/Archive 82|Archive 82: 2024 June 16-30]]
* [[Server Admin Log/Archive 83|Archive 83: 2024 July]]
* [[Server Admin Log/Archive 84|Archive 84: 2024 August]]
* [[Server Admin Log/Archive 85|Archive 85: 2024 September]]
* [[Server Admin Log/Archive 86|Archive 86: 2024 October]]
* [[Server Admin Log/Archive 87|Archive 87: 2024 November]]
* [[Server Admin Log/Archive 88|Archive 88: 2024 December]]
</div>
===2025-present===
<div style="column-count:2;-moz-column-count:2;-webkit-column-count:2">
* [[Server Admin Log/Archive 89|Archive 89: 2025 January]]
* [[Server Admin Log/Archive 90|Archive 90: 2025 February]]
* [[Server Admin Log/Archive 91|Archive 91: 2025 March]]
* [[Server Admin Log/Archive 92|Archive 92: 2025 April]]
* [[Server Admin Log/Archive 93|Archive 93: 2025 May]]
* [[Server Admin Log/Archive 94|Archive 94: 2025 June]]
* [[Server Admin Log/Archive 95|Archive 95: 2025 July]]
* [[Server Admin Log/Archive 96|Archive 96: 2025 August]]
* [[Server Admin Log/Archive 97|Archive 97: 2025 September]]
* [[Server Admin Log/Archive 98|Archive 98: 2025 October]]
* [[Server Admin Log/Archive 99|Archive 99: 2025 November]]
* [[Server Admin Log/Archive 100|Archive 100: 2025 December]]
* [[Server Admin Log/Archive 101|Archive 101: 2026 January]]
* [[Server Admin Log/Archive 102|Archive 102: 2026 February]]
</div>
<includeonly>
[[Category:Server Admin Log archive]]
</includeonly>
1y8w9vrp19hqqaom62wbhw9px48d71z
Server Admin Log
0
7919
2396603
2396589
2026-03-28T14:16:50Z
Stashbot
7414
mutante: releases1003 - re-enabled puppet which was disabled due to T418109 but should not have been disabled during switch of the deployment server; leading to T421532
2396603
wikitext
text/x-wiki
== 2026-03-28 ==
* 14:16 mutante: releases1003 - re-enabled puppet which was disabled due to [[phab:T418109|T418109]] but should not have been disabled during switch of the deployment server; leading to [[phab:T421532|T421532]]
== 2026-03-27 ==
* 18:11 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 18:00 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:50 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:40 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:39 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 17:39 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:39 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 17:38 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 17:37 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 17:37 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 17:35 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 17:34 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 17:34 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:30 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:30 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:24 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:19 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:15 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:04 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:55 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:50 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:47 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:42 dancy@deploy1003: Finished deploy [releng/jenkins-deploy@31ace7e] (releasing): (no justification provided) (duration: 01m 18s)
* 16:41 dancy@deploy1003: Started deploy [releng/jenkins-deploy@31ace7e] (releasing): (no justification provided)
* 16:37 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:36 jhathaway@cumin2002: END (ERROR) - Cookbook sre.hosts.provision (exit_code=97) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:27 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:22 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:13 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 16:12 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 16:12 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:11 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:10 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 15:07 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:00 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:09 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:09 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change ips for frack servers - cmooney@cumin1003"
* 14:08 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change ips for frack servers - cmooney@cumin1003"
* 14:02 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 13:52 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:51 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:49 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:49 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:48 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:47 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:47 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:45 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:41 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:40 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:14 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:11 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:10 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:08 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:06 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 11:53 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 11:51 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 11:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 11:48 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 11:30 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 11:27 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 11:15 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-test1006.eqiad.wmnet with OS trixie
* 11:15 taavi@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database abstractwiki ([[phab:T420637|T420637]])
* 11:02 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2002.codfw.wmnet
* 10:54 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2006.codfw.wmnet
* 10:51 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2002.codfw.wmnet
* 10:50 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2006.codfw.wmnet
* 10:46 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2005.codfw.wmnet
* 10:43 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2005.codfw.wmnet
* 10:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1006.eqiad.wmnet with reason: host reimage
* 10:27 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1006.eqiad.wmnet with reason: host reimage
* 10:18 taavi@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database abstractwiki ([[phab:T420637|T420637]])
* 10:12 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1006.eqiad.wmnet with OS trixie
* 10:04 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 10:03 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:01 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:00 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:58 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:57 elukey@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
* 09:37 elukey@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
* 09:06 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 09:05 elukey@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 09:04 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 09:03 elukey@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 08:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 08:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:05 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 08:04 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 08:02 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 07:46 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 03:06 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 02:32 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 02:12 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 07s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:30 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421330|T421330]]
* 01:30 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on zuul1001.eqiad.wmnet with reason: [[phab:T421330|T421330]]
* 01:30 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on zuul2001.codfw.wmnet with reason: [[phab:T421330|T421330]]
* 01:29 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421330|T421330]]
* 01:12 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
== 2026-03-26 ==
* 21:35 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1260834{{!}}Add Logstash logging for successful passwordless logins]], [[gerrit:1261511{{!}}InitialiseSettings: Remove apiportalwiki from $wmgCentralAuthAutoLoginWikis (T421413)]] (duration: 06m 58s)
* 21:31 reedy@deploy1003: catrope, reedy: Continuing with sync
* 21:30 reedy@deploy1003: catrope, reedy: Backport for [[gerrit:1260834{{!}}Add Logstash logging for successful passwordless logins]], [[gerrit:1261511{{!}}InitialiseSettings: Remove apiportalwiki from $wmgCentralAuthAutoLoginWikis (T421413)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:28 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1260834{{!}}Add Logstash logging for successful passwordless logins]], [[gerrit:1261511{{!}}InitialiseSettings: Remove apiportalwiki from $wmgCentralAuthAutoLoginWikis (T421413)]]
* 21:00 suecarmol@deploy1003: Finished scap sync-world: Backport for [[gerrit:1256498{{!}}PersonalDashboard: Add config for Active Discussions (T420785)]] (duration: 13m 53s)
* 20:54 suecarmol@deploy1003: suecarmol: Continuing with sync
* 20:51 suecarmol@deploy1003: suecarmol: Backport for [[gerrit:1256498{{!}}PersonalDashboard: Add config for Active Discussions (T420785)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:46 suecarmol@deploy1003: Started scap sync-world: Backport for [[gerrit:1256498{{!}}PersonalDashboard: Add config for Active Discussions (T420785)]]
* 20:44 kamila@deploy1003: Finished scap sync-world: Backport for [[gerrit:1261545{{!}}Wrap 'centralauthtoken' in a JWT (T420280)]], [[gerrit:1261470{{!}}Enable $wgTempCategoryCollations for testwiki. (T419274 T419049)]] (duration: 37m 32s)
* 20:30 kamila@deploy1003: matmarex, kamila: Continuing with sync
* 20:25 kamila@deploy1003: matmarex, kamila: Backport for [[gerrit:1261545{{!}}Wrap 'centralauthtoken' in a JWT (T420280)]], [[gerrit:1261470{{!}}Enable $wgTempCategoryCollations for testwiki. (T419274 T419049)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host restbase2039.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:09 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host restbase2039.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:08 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>aqs1015.eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 20:08 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host restbase2039
* 20:07 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host restbase2039
* 20:06 kamila@deploy1003: Started scap sync-world: Backport for [[gerrit:1261545{{!}}Wrap 'centralauthtoken' in a JWT (T420280)]], [[gerrit:1261470{{!}}Enable $wgTempCategoryCollations for testwiki. (T419274 T419049)]]
* 20:05 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:05 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding restbase2039 to codfw - jhancock@cumin2002"
* 20:05 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding restbase2039 to codfw - jhancock@cumin2002"
* 20:02 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 20:00 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>aqs1015.eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 19:47 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 19:44 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 18:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 18:48 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 18:42 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/zotero: apply
* 18:42 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/zotero: apply
* 18:42 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
* 18:41 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
* 18:41 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
* 18:40 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/toolhub: apply
* 18:40 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/termbox: apply
* 18:40 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 18:39 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/termbox: apply
* 18:39 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 18:37 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 18:37 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 18:37 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 18:36 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 18:36 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 18:36 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 18:36 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
* 18:35 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
* 18:35 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/recommendation-api: apply
* 18:35 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/recommendation-api: apply
* 18:34 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rdf-streaming-updater: apply
* 18:34 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply
* 18:33 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/push-notifications: apply
* 18:33 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/push-notifications: apply
* 18:32 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 18:31 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 18:31 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
* 18:30 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
* 18:30 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 18:30 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
* 18:30 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 18:28 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 18:28 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:28 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:27 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
* 18:25 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>sessionstore1006.eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 18:21 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
* 18:21 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 18:20 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 18:20 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
* 18:19 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/ipoid: apply
* 18:19 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/image-suggestion: apply
* 18:19 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/image-suggestion: apply
* 18:18 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>sessionstore1006.eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 18:18 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
* 18:17 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
* 18:17 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
* 18:16 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
* 18:15 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 18:14 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 18:13 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
* 18:13 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
* 18:13 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
* 18:12 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
* 18:12 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:12 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 18:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 18:11 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 18:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
* 18:11 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
* 18:10 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
* 18:09 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
* 18:09 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/echostore: apply
* 18:08 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/echostore: apply
* 18:08 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
* 18:08 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
* 18:08 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 18:07 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 18:07 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
* 18:07 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
* 18:07 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 18:06 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 18:06 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/commons-impact-analytics: apply
* 18:06 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/commons-impact-analytics: apply
* 18:04 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 18:04 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 18:03 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 18:03 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 18:03 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/chart-renderer: apply
* 18:02 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/chart-renderer: apply
* 17:59 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/apertium: apply
* 17:58 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/apertium: apply
* 17:55 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to enable envoy drain on remaining services - [[phab:T364245|T364245]] (duration: 05m 31s)
* 17:52 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to enable envoy drain on remaining services - [[phab:T364245|T364245]]
* 17:35 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:39 rzl@deploy1003: Finished scap sync-world: https://gerrit.wikimedia.org/r/1256396 [[phab:T420666|T420666]] (duration: 11m 21s)
* 16:35 rzl@deploy1003: rzl: Continuing with sync
* 16:34 rzl@deploy1003: rzl: https://gerrit.wikimedia.org/r/1256396 [[phab:T420666|T420666]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:31 rzl@deploy1003: Started scap sync-world: https://gerrit.wikimedia.org/r/1256396 [[phab:T420666|T420666]]
* 16:27 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 16:17 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 16:17 blake@deploy1003: Finished scap sync-world: Test deployment to validate deployment server switchover - [[phab:T413974|T413974]] (duration: 31m 09s)
* 16:16 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 16:05 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 15:47 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1202.eqiad.wmnet onto db1253.eqiad.wmnet
* 15:47 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1253: Pool db1253.eqiad.wmnet in after cloning
* 15:46 blake@deploy1003: Started scap sync-world: Test deployment to validate deployment server switchover - [[phab:T413974|T413974]]
* 15:44 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 15:43 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421330|T421330]]
* 15:43 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421330|T421330]]
* 15:33 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 15:30 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync
* 15:30 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: sync
* 15:23 blake@dns1004: END - running authdns-update
* 15:22 bjensen: updating DNS for the deployment host switchover
* 15:21 blake@dns1004: START - running authdns-update
* 15:19 blake@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet,releases1003.eqiad.wmnet with reason: Deployment server switchover
* 15:03 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:01 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1253: Pool db1253.eqiad.wmnet in after cloning
* 14:39 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:28 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:23 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 14:22 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 14:21 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 14:21 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 14:21 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:20 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 14:20 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 14:20 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 14:19 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 14:18 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 14:18 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 14:17 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 14:17 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 14:17 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 14:17 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 14:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1202: Pool db1202.eqiad.wmnet in after cloning
* 13:57 jynus: dropping ms-backup[12]00[12] grants from backup1-* dbs [[phab:T420464|T420464]]
* 13:56 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1070.eqiad.wmnet
* 13:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1070.eqiad.wmnet
* 13:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1097.eqiad.wmnet
* 13:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1097.eqiad.wmnet
* 13:54 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1055.eqiad.wmnet
* 13:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1055.eqiad.wmnet
* 13:46 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:45 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:40 sergi0: UTC afternoon backport window done
* 13:39 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259132{{!}}GrowthExperiments: scale edit and thanks query limit to more wikis (T341599)]] (duration: 09m 17s)
* 13:35 sgimeno@deploy2002: sgimeno: Continuing with sync
* 13:32 sgimeno@deploy2002: sgimeno: Backport for [[gerrit:1259132{{!}}GrowthExperiments: scale edit and thanks query limit to more wikis (T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1259132{{!}}GrowthExperiments: scale edit and thanks query limit to more wikis (T341599)]]
* 13:26 jforrester@deploy2002: Finished deploy [integration/docroot@f021d3f]: {{Gerrit|Ia936ecd68e675cff2925dba933e3b67b9bad4cd6}} (duration: 00m 11s)
* 13:26 jforrester@deploy2002: Started deploy [integration/docroot@f021d3f]: {{Gerrit|Ia936ecd68e675cff2925dba933e3b67b9bad4cd6}}
* 13:24 kamila@deploy2002: Finished scap sync-world: Backport for [[gerrit:1256384{{!}}Temporarily add shellbox-icu to $wgShellboxUrls (T419049 T419242 T419274)]] (duration: 07m 16s)
* 13:20 kamila@deploy2002: kamila: Continuing with sync
* 13:19 kamila@deploy2002: kamila: Backport for [[gerrit:1256384{{!}}Temporarily add shellbox-icu to $wgShellboxUrls (T419049 T419242 T419274)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:17 kamila@deploy2002: Started scap sync-world: Backport for [[gerrit:1256384{{!}}Temporarily add shellbox-icu to $wgShellboxUrls (T419049 T419242 T419274)]]
* 13:15 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1202: Pool db1202.eqiad.wmnet in after cloning
* 13:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 13:13 kamila@deploy2002: Finished scap sync-world: Backport for [[gerrit:1261420{{!}}cswiki: lift IP cap for editathon (T421305)]] (duration: 07m 22s)
* 13:12 btullis@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 13:09 kamila@deploy2002: kamila, anzx: Continuing with sync
* 13:08 jynus: deploying new grants for new ms-backup hosts and removing old ones [[phab:T420464|T420464]]
* 13:08 kamila@deploy2002: kamila, anzx: Backport for [[gerrit:1261420{{!}}cswiki: lift IP cap for editathon (T421305)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:06 kamila@deploy2002: Started scap sync-world: Backport for [[gerrit:1261420{{!}}cswiki: lift IP cap for editathon (T421305)]]
* 13:03 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:52 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:50 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:43 cdanis: puppet reenabled on drmrs, CIDERGRINDER deployed
* 12:39 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:39 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:23 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:12 cdanis: 💔cdanis@cumin1003.eqiad.wmnet ~ 🕗☕ sudo cumin 'A:cp-drmrs' 'disable-puppet "cdanis CIDER"'
* 12:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1004.eqiad.wmnet
* 12:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1006.eqiad.wmnet
* 12:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1003.eqiad.wmnet
* 12:02 elukey@cumin1003: END (PASS) - Cookbook sre.postgresql.postgres-init (exit_code=0)
* 12:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1002.eqiad.wmnet
* 12:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1005.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1006.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1005.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1004.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1003.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1002.eqiad.wmnet
* 11:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1001.eqiad.wmnet
* 11:44 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:43 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:43 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:42 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:42 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:42 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:41 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:41 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 11:41 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 11:41 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1001.eqiad.wmnet
* 11:38 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1002.eqiad.wmnet
* 11:37 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 11:33 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1002.eqiad.wmnet
* 11:32 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1202: Depool db1202.eqiad.wmnet to then clone it to db1253.eqiad.wmnet - fceratto@cumin1003
* 11:31 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
* 11:31 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1202: Depool db1202.eqiad.wmnet to then clone it to db1253.eqiad.wmnet - fceratto@cumin1003
* 11:31 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db1202.eqiad.wmnet onto db1253.eqiad.wmnet
* 11:31 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 11:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1001.eqiad.wmnet
* 11:22 elukey@cumin1003: START - Cookbook sre.postgresql.postgres-init
* 11:22 elukey@cumin1003: END (FAIL) - Cookbook sre.postgresql.postgres-init (exit_code=99)
* 11:22 elukey@cumin1003: START - Cookbook sre.postgresql.postgres-init
* 11:19 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1001.eqiad.wmnet
* 11:15 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync
* 11:14 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: sync
* 11:14 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 11:13 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 11:07 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 11:04 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260802{{!}}SI: Enable on bnwiki, itwiki, simplewiki, and plwiki (T415529)]] (duration: 09m 23s)
* 10:59 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 10:56 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1260802{{!}}SI: Enable on bnwiki, itwiki, simplewiki, and plwiki (T415529)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:54 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1260802{{!}}SI: Enable on bnwiki, itwiki, simplewiki, and plwiki (T415529)]]
* 10:33 oblivian@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: sync
* 10:32 oblivian@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifeeds: sync
* 10:32 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 10:32 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 10:31 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 10:31 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 10:31 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 10:31 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 10:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:23 tappof@cumin1003: END (PASS) - Cookbook sre.o11y.thanos-compact-restart (exit_code=0)
* 10:23 tappof@cumin1003: START - Cookbook sre.o11y.thanos-compact-restart
* 10:22 tappof@cumin1003: END (PASS) - Cookbook sre.o11y.thanos-compact-restart (exit_code=0)
* 10:22 tappof@cumin1003: START - Cookbook sre.o11y.thanos-compact-restart
* 10:12 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s1
* 10:11 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s1
* 10:05 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s4
* 10:05 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s4
* 09:58 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s8
* 09:58 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s8
* 09:53 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 09:53 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:52 hashar: Starting Gerrit on the replica / gerrit1003
* 09:51 hashar: Stopping Gerrit on the replica / gerrit1003 to clear web sessions
* 09:51 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s7
* 09:50 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s7
* 09:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 09:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 09:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 09:46 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 09:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 09:44 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s3
* 09:43 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s3
* 09:42 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 09:36 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s2
* 09:36 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s2
* 09:31 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 09:29 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s5
* 09:29 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:29 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s5
* 09:26 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 09:22 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s6
* 09:22 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 09:22 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s6
* 09:18 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:16 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section es6
* 09:15 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section es6
* 09:13 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 09:11 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 09:10 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 09:08 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section x3
* 09:07 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section x3
* 09:05 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 09:02 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section x1
* 09:01 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section x1
* 09:01 federico3: starting [[phab:T416708|T416708]] - disabling circular replication on core dbs
* 08:56 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 08:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 08:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 08:43 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:41 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:32 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 08:27 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 08:18 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 08:11 hashar@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.21 refs [[phab:T420479|T420479]]
* 05:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Downgrade clouddb1022 to 10.11.13
== 2026-03-25 ==
* 23:59 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on zuul2001.codfw.wmnet with reason: [[phab:T421330|T421330]]
* 23:58 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on zuul1001.eqiad.wmnet with reason: [[phab:T421330|T421330]]
* 23:29 mutante: zuul1001 - installed mariadb-client - connected once to zuul db on m1-master; mysql> truncate "alembic_version"; - systemctl restart zuul-web - This fixed the zuul-web service. Finally no error in systemctl status. ([[phab:T405119|T405119]])
* 21:38 ryankemper: [opensearch-k8s] [[phab:T414484|T414484]] Depooled eqiad; change verified working (now when I do `host k8s-ingress-dse-aa.discovery.wmnet` from `cumin1003`, and then reverse-lookup the resulting IP, I get a codfw address); so traffic is now routing to dse-k8s-codfw
* 21:35 ryankemper@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 21:30 Dreamy_Jazz: Created cusi_case, cusi_user, and cusi_signal on bnwiki, itwiki, simplewiki, plwiki for [[phab:T415529|T415529]]
* 21:27 ryankemper: [opensearch-k8s] [[phab:T414484|T414484]] Getting ready to depool `dnsdisc=k8s-ingress-dse-aa,name=eqiad`, leaving codfw pooled. This will get us ready for a full rolling-upgrade of the dse-k8s-eqiad cluster tomorrow.
* 21:23 Dreamy_Jazz: Evening UTC backport window done
* 21:08 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260797{{!}}SuggestedInvestigations: Import session into signal matching job (T421062)]] (duration: 10m 26s)
* 21:04 kharlan@deploy2002: kharlan: Continuing with sync
* 21:01 kharlan@deploy2002: kharlan: Backport for [[gerrit:1260797{{!}}SuggestedInvestigations: Import session into signal matching job (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:58 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1260797{{!}}SuggestedInvestigations: Import session into signal matching job (T421062)]]
* 20:51 eevans@cumin1003: END (ERROR) - Cookbook sre.cassandra.roll-reboot (exit_code=97) rolling reboot on P<nowiki>{</nowiki>sessionstore[1004-1006].eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 20:43 aaron@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259183{{!}}Add Analytics APIs to the RestSandbox (T419429)]] (duration: 08m 33s)
* 20:38 aaron@deploy2002: aaron: Continuing with sync
* 20:36 aaron@deploy2002: aaron: Backport for [[gerrit:1259183{{!}}Add Analytics APIs to the RestSandbox (T419429)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:34 aaron@deploy2002: Started scap sync-world: Backport for [[gerrit:1259183{{!}}Add Analytics APIs to the RestSandbox (T419429)]]
* 20:30 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247073{{!}}Deploy temporary accounts to ruwiki (T413771)]] (duration: 11m 04s)
* 20:25 jdlrobson@deploy2002: stran, jdlrobson: Continuing with sync
* 20:21 jdlrobson@deploy2002: stran, jdlrobson: Backport for [[gerrit:1247073{{!}}Deploy temporary accounts to ruwiki (T413771)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:19 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1247073{{!}}Deploy temporary accounts to ruwiki (T413771)]]
* 20:17 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255775{{!}}Close the legacy-vector dblist (T421289)]] (duration: 07m 42s)
* 20:14 aokoth@cumin1003: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:14 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:13 aokoth@cumin1003: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:13 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:12 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 20:12 aokoth@cumin1003: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:12 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1255775{{!}}Close the legacy-vector dblist (T421289)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1255775{{!}}Close the legacy-vector dblist (T421289)]]
* 20:05 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy (exit_code=0) rolling reboot on P<nowiki>{</nowiki>hcaptcha-proxy7002.wikimedia.org<nowiki>}</nowiki> and A:hcaptcha-proxy
* 20:01 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy rolling reboot on P<nowiki>{</nowiki>hcaptcha-proxy7002.wikimedia.org<nowiki>}</nowiki> and A:hcaptcha-proxy
* 20:00 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>sessionstore[1004-1006].eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 19:34 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pybal-test2003.codfw.wmnet
* 19:30 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host pybal-test2003.codfw.wmnet
* 19:26 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 19:24 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>sessionstore[2004-2006].codfw.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 19:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2011.codfw.wmnet
* 19:17 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2011.codfw.wmnet
* 19:17 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 19:14 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Planned reboot
* 19:11 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 19:11 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 19:07 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 19:00 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2012.codfw.wmnet
* 18:57 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2012.codfw.wmnet
* 18:53 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>sessionstore[2004-2006].codfw.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 18:51 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
* 18:51 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
* 18:50 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
* 18:50 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
* 18:49 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
* 18:49 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
* 18:49 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
* 18:48 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/toolhub: apply
* 18:48 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/termbox: apply
* 18:47 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/termbox: apply
* 18:47 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 18:47 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 18:47 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 18:46 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 18:46 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:46 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:46 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 18:46 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2012.codfw.wmnet with reason: Planned reboot
* 18:46 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 18:46 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 18:45 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 18:45 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 18:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 18:44 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
* 18:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/sessionstore: apply
* 18:44 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/recommendation-api: apply
* 18:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/recommendation-api: apply
* 18:43 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
* 18:43 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
* 18:43 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/push-notifications: apply
* 18:42 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/push-notifications: apply
* 18:42 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 18:41 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/proton: apply
* 18:41 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 18:40 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
* 18:40 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/page-analytics: apply
* 18:39 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 18:39 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
* 18:37 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 18:37 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 18:37 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 18:37 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 18:35 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 18:34 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:34 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:33 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
* 18:29 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 18:28 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
* 18:28 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* 18:26 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 18:26 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
* 18:26 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
* 18:25 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: debug java install
* 18:25 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 18:25 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases1003.eqiad.wmnet with reason: debug java install
* 18:25 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/image-suggestion: apply
* 18:24 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/image-suggestion: apply
* 18:23 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
* 18:23 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
* 18:23 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
* 18:22 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
* 18:22 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 18:22 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 18:21 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
* 18:21 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
* 18:20 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
* 18:20 mutante: releases1003 - apt-get upgrade - envoyproxy, python3-wmflib
* 18:20 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
* 18:20 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:19 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 18:19 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 18:18 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 18:18 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
* 18:18 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
* 18:17 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
* 18:17 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
* 18:16 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 18:16 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/echostore: apply
* 18:15 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
* 18:15 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
* 18:15 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 18:14 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 18:14 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/data-gateway: apply
* 18:14 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/data-gateway: apply
* 18:14 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 18:13 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 18:13 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/commons-impact-analytics: apply
* 18:13 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/commons-impact-analytics: apply
* 18:12 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 18:12 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
* 18:11 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 18:11 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
* 18:11 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/chart-renderer: apply
* 18:11 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/chart-renderer: apply
* 18:09 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/apertium: apply
* 18:09 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/apertium: apply
* 17:29 sfaci@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 17:29 sfaci@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 17:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 17:23 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases1003.eqiad.wmnet with reason: debug java install
* 17:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 16:44 ebysans@deploy2002: Finished deploy [analytics/refinery@80c527b] (thin): Regular analytics weekly train THIN [analytics/refinery@80c527b6] (duration: 01m 59s)
* 16:42 ebysans@deploy2002: Started deploy [analytics/refinery@80c527b] (thin): Regular analytics weekly train THIN [analytics/refinery@80c527b6]
* 16:42 SandraEbele_: Deploying Refinery as part of weekly deployment train
* 16:41 ebysans@deploy2002: Finished deploy [analytics/refinery@80c527b]: Regular analytics weekly train [analytics/refinery@80c527b6] (duration: 04m 32s)
* 16:37 ebysans@deploy2002: Started deploy [analytics/refinery@80c527b]: Regular analytics weekly train [analytics/refinery@80c527b6]
* 16:22 ebysans@deploy2002: Finished deploy [analytics/refinery@80c527b] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@80c527b6] (duration: 01m 58s)
* 16:22 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 16:21 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:21 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:20 ebysans@deploy2002: Started deploy [analytics/refinery@80c527b] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@80c527b6]
* 16:20 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 16:19 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 16:18 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:18 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:06 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 16:05 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 16:05 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 16:04 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 16:03 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:02 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 16:02 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 16:01 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:51 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:50 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:42 blake@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244628{{!}}debug: reorder debug backends for eqiad switchover (T413974)]] (duration: 07m 41s)
* 15:41 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:37 blake@deploy2002: blake: Continuing with sync
* 15:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:37 blake@deploy2002: blake: Backport for [[gerrit:1244628{{!}}debug: reorder debug backends for eqiad switchover (T413974)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:34 blake@deploy2002: Started scap sync-world: Backport for [[gerrit:1244628{{!}}debug: reorder debug backends for eqiad switchover (T413974)]]
* 15:34 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:32 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-unlock-scap (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:32 root@deploy2002: Unlocked for deployment [ALL REPOSITORIES]: Datacenter switchover from codfw to eqiad - (duration: 91m 45s)
* 15:32 root@deploy2002: Forcefully removing global lock: Datacenter switchover from codfw to eqiad -
* 15:32 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-unlock-scap for datacenter switchover from codfw to eqiad
* 15:31 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:26 blake@dns1004: END - running authdns-update
* 15:24 blake@dns1004: START - running authdns-update
* 15:24 elukey@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 15:23 elukey@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 15:18 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters for datacenter switchover from codfw to eqiad
* 15:18 blake@dns1004: END - running authdns-update
* 15:16 blake@dns1004: START - running authdns-update
* 15:14 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-restore-ttl (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:13 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-restore-ttl for datacenter switchover from codfw to eqiad
* 15:11 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:10 root@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 15:09 root@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 15:08 root@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 15:07 root@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 15:07 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.08-start-maintenance for datacenter switchover from codfw to eqiad
* 15:07 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:07 root@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: sync
* 15:07 root@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: sync
* 15:07 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner for datacenter switchover from codfw to eqiad
* 15:02 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:02 blake@cumin1003: MediaWiki read-only period ends at: 2026-03-25 15:02:52.921926
* 14:55 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0) for datacenter switchover from codfw to eqiad
* 14:53 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance for datacenter switchover from codfw to eqiad
* 14:52 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks (exit_code=0) for datacenter switchover from codfw to eqiad
* 14:52 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks for datacenter switchover from codfw to eqiad
* 14:51 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0) for datacenter switchover from codfw to eqiad
* 14:46 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl for datacenter switchover from codfw to eqiad
* 14:28 cdanis: 💙cdanis@apt1002.wikimedia.org ~ 🕥☕ sudo -i reprepro --component main --restrict cidergrinder update trixie-wikimedia
* 14:28 cdanis: 💙cdanis@apt1002.wikimedia.org ~ 🕥☕ sudo -i reprepro --component main --restrict cidergrinder update bullseye-wikimedia
* 14:24 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['phab2002']
* 14:24 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:17 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['phab2002']
* 14:14 gengh@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:13 gengh@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:13 gengh@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:12 gengh@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 gengh@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:11 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:11 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:11 gengh@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:10 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:10 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:08 gengh@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:07 gengh@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 gengh@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:07 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:06 gengh@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:06 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:06 gengh@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:05 gengh@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-lock-scap (exit_code=0) for datacenter switchover from codfw to eqiad
* 14:00 root@deploy2002: Locking from deployment [ALL REPOSITORIES]: Datacenter switchover from codfw to eqiad -
* 14:00 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-lock-scap for datacenter switchover from codfw to eqiad
* 13:49 otto@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260091{{!}}EventStreamConfig - Increase spark_job_ingestion_scale for larger event streams (T360794 T351225)]] (duration: 07m 48s)
* 13:45 otto@deploy2002: otto: Continuing with sync
* 13:45 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:44 otto@deploy2002: otto: Backport for [[gerrit:1260091{{!}}EventStreamConfig - Increase spark_job_ingestion_scale for larger event streams (T360794 T351225)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:42 otto@deploy2002: Started scap sync-world: Backport for [[gerrit:1260091{{!}}EventStreamConfig - Increase spark_job_ingestion_scale for larger event streams (T360794 T351225)]]
* 13:32 awight@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260614{{!}}[beta] Kill synthetic refs with feature flag (T421055)]], [[gerrit:1251193{{!}}idwiki: Remove unused user groups on Indonesian Wikipedia (T419105)]], [[gerrit:1251200{{!}}ptwiki: Enable block action for the abuse filter (T419312)]], [[gerrit:1256748{{!}}ptwiki: Add suppressredirect to autoreviewer and rollbacker user groups (T420704)]] (duration: 11m 33s)
* 13:29 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:29 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:27 awight@deploy2002: codenamenoreste, awight, gerrit-patch-uploader: Continuing with sync
* 13:23 awight@deploy2002: codenamenoreste, awight, gerrit-patch-uploader: Backport for [[gerrit:1260614{{!}}[beta] Kill synthetic refs with feature flag (T421055)]], [[gerrit:1251193{{!}}idwiki: Remove unused user groups on Indonesian Wikipedia (T419105)]], [[gerrit:1251200{{!}}ptwiki: Enable block action for the abuse filter (T419312)]], [[gerrit:1256748{{!}}ptwiki: Add suppressredirect to autoreviewer and rollbacker user groups (T420704)]]
* 13:20 awight@deploy2002: Started scap sync-world: Backport for [[gerrit:1260614{{!}}[beta] Kill synthetic refs with feature flag (T421055)]], [[gerrit:1251193{{!}}idwiki: Remove unused user groups on Indonesian Wikipedia (T419105)]], [[gerrit:1251200{{!}}ptwiki: Enable block action for the abuse filter (T419312)]], [[gerrit:1256748{{!}}ptwiki: Add suppressredirect to autoreviewer and rollbacker user groups (T420704)]]
* 13:17 dcausse@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260045{{!}}Revert^2 "search: use the discovery ns record for the semanticsearch cluster"]] (duration: 10m 20s)
* 13:12 dcausse@deploy2002: dcausse: Continuing with sync
* 13:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:09 dcausse@deploy2002: dcausse: Backport for [[gerrit:1260045{{!}}Revert^2 "search: use the discovery ns record for the semanticsearch cluster"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:07 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:06 dcausse@deploy2002: Started scap sync-world: Backport for [[gerrit:1260045{{!}}Revert^2 "search: use the discovery ns record for the semanticsearch cluster"]]
* 13:06 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:02 XioNoX: Inter.Link - DDoS - Activation of automatic reroute
* 12:56 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:51 marostegui@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 12:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Downgrade clouddb1022 to 10.11.15
* 12:41 marostegui@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 12:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-mariadb1002.eqiad.wmnet
* 12:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-test-coord1002.eqiad.wmnet
* 12:38 mszwarc@deploy2002: mwscript-k8s job started: foreachwikiindblist all demoteIneligibleUsers.php --relay-log checkuser=metawiki --relay-log suppress=metawiki # [[phab:T418580|T418580]]
* 12:34 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-test-coord1002.eqiad.wmnet
* 12:34 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-mariadb1002.eqiad.wmnet
* 12:33 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:32 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1028.eqiad.wmnet
* 12:25 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host wdqs1028.eqiad.wmnet
* 12:24 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:19 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260617{{!}}Allow for demoting 2FA-less members of further 6 groups (T418580)]] (duration: 10m 23s)
* 12:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2009.codfw.wmnet
* 12:12 mszwarc@deploy2002: mszwarc: Continuing with sync
* 12:11 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1260617{{!}}Allow for demoting 2FA-less members of further 6 groups (T418580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:09 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1260617{{!}}Allow for demoting 2FA-less members of further 6 groups (T418580)]]
* 12:07 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host wdqs2009.codfw.wmnet
* 12:07 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl2002.codfw.wmnet
* 12:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl2002.codfw.wmnet
* 11:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl2001.codfw.wmnet
* 11:53 marostegui: Restart clouddb1022:s3 to enable error_log [[phab:T420177|T420177]]
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl2001.codfw.wmnet
* 11:51 jayme: migrated wikikube apiservers (eqiad and codfw) to IPIP - [[phab:T420436|T420436]]
* 11:49 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: wikikube-master-codfw@codfw
* 11:49 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 11:48 jayme@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 11:46 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: wikikube-master-eqiad@eqiad
* 11:46 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 11:45 jayme@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 11:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1002.eqiad.wmnet
* 11:43 jayme@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: wikikube-master-codfw@codfw
* 11:41 jayme@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: wikikube-master-eqiad@eqiad
* 11:40 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1002.eqiad.wmnet
* 11:38 mvernon@deploy2002: helmfile [codfw] DONE helmfile.d/services/kartotherian: apply
* 11:36 mvernon@deploy2002: helmfile [codfw] START helmfile.d/services/kartotherian: apply
* 11:21 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:18 mvernon@deploy2002: helmfile [codfw] DONE helmfile.d/services/kartotherian: apply
* 11:16 mvernon@deploy2002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
* 11:15 mvernon@deploy2002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
* 11:15 mvernon@deploy2002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
* 11:14 mvernon@deploy2002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
* 11:07 mvernon@deploy2002: helmfile [codfw] START helmfile.d/services/kartotherian: apply
* 11:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis abstractwiki in section s5
* 11:07 mvernon@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: apply
* 11:05 mvernon@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: apply
* 10:55 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:53 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis abstractwiki in section s5
* 10:45 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 10:33 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:27 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:26 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:21 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:20 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:20 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:01 elukey@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=kartotherian,name=codfw
* 09:58 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:57 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:52 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:52 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:51 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:51 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:46 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:45 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:45 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:44 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:05 elukey@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=aux-k8s-worker200[2-5].codfw.wmnet,cluster=aux-k8s,service=kubesvc
* 09:04 elukey@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=aux-k8s-worker200[6-9].codfw.wmnet,cluster=aux-k8s,service=kubesvc
* 09:04 elukey@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=aux-k8s-worker100[6-9].eqiad.wmnet,cluster=aux-k8s,service=kubesvc
* 08:55 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker200[6-9].eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:55 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker200[6-9].eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:35 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker1009.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:35 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker1008.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker1007.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker1006.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker1009.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker1008.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker1007.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker1006.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device fasw2-c8b-codfw
* 08:29 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device fasw2-c8b-codfw
* 08:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device fasw2-c8a-codfw
* 08:29 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device fasw2-c8a-codfw
* 08:10 hashar@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.21 refs [[phab:T420479|T420479]]
* 00:33 rzl@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 00:23 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
* 00:23 rzl@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
* 00:23 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 00:22 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 00:22 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
* 00:22 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
* 00:21 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
* 00:21 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
* 00:21 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 00:20 rzl@deploy2002: helmfile [staging] START helmfile.d/services/toolhub: apply
* 00:20 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/termbox: apply
* 00:20 rzl@deploy2002: helmfile [staging] START helmfile.d/services/termbox: apply
* 00:20 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
* 00:20 rzl@deploy2002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
* 00:19 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 00:19 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 00:18 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 00:18 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 00:18 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 00:18 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 00:17 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 00:17 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 00:17 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 00:17 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 00:17 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 00:16 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
* 00:16 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 00:16 rzl@deploy2002: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 00:16 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/recommendation-api: apply
* 00:15 rzl@deploy2002: helmfile [staging] START helmfile.d/services/recommendation-api: apply
* 00:15 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
* 00:15 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
* 00:15 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1023.eqiad.wmnet with OS bookworm
* 00:15 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/push-notifications: apply
* 00:15 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:15 rzl@deploy2002: helmfile [staging] START helmfile.d/services/push-notifications: apply
* 00:15 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/proton: apply
* 00:14 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:14 rzl@deploy2002: helmfile [staging] START helmfile.d/services/proton: apply
* 00:13 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
* 00:13 rzl@deploy2002: helmfile [staging] START helmfile.d/services/page-analytics: apply
* 00:13 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 00:13 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
* 00:13 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 00:12 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 00:12 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 00:11 rzl@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
* 00:10 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 00:10 rzl@deploy2002: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 00:09 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
* 00:07 rzl@deploy2002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
* 00:07 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 00:06 rzl@deploy2002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* 00:06 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
* 00:06 rzl@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: apply
* 00:06 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
* 00:06 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1022.eqiad.wmnet with OS bookworm
* 00:06 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:06 rzl@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
* 00:06 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/image-suggestion: apply
* 00:06 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:05 rzl@deploy2002: helmfile [staging] START helmfile.d/services/image-suggestion: apply
* 00:05 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
* 00:05 rzl@deploy2002: helmfile [staging] START helmfile.d/services/geo-analytics: apply
* 00:05 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 00:05 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 00:04 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 00:04 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 00:04 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
* 00:04 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1021.eqiad.wmnet with OS bookworm
* 00:04 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:04 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
* 00:04 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 00:04 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:03 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 00:03 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 00:03 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 00:02 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
* 00:02 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
* 00:02 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
* 00:01 rzl@deploy2002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
* 00:01 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
* 00:01 rzl@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
* 00:00 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 00:00 rzl@deploy2002: helmfile [staging] START helmfile.d/services/echostore: apply
* 00:00 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
* 00:00 rzl@deploy2002: helmfile [staging] START helmfile.d/services/device-analytics: apply
== 2026-03-24 ==
* 23:59 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 23:59 rzl@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 23:59 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
* 23:59 rzl@deploy2002: helmfile [staging] START helmfile.d/services/data-gateway: apply
* 23:58 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 23:58 rzl@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 23:58 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/commons-impact-analytics: apply
* 23:58 rzl@deploy2002: helmfile [staging] START helmfile.d/services/commons-impact-analytics: apply
* 23:56 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 23:56 rzl@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
* 23:56 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 23:56 rzl@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
* 23:54 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/chart-renderer: apply
* 23:53 rzl@deploy2002: helmfile [staging] START helmfile.d/services/chart-renderer: apply
* 23:53 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1023.eqiad.wmnet with reason: host reimage
* 23:53 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/apertium: apply
* 23:52 rzl@deploy2002: helmfile [staging] START helmfile.d/services/apertium: apply
* 23:46 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1022.eqiad.wmnet with reason: host reimage
* 23:43 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1021.eqiad.wmnet with reason: host reimage
* 23:19 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1023.eqiad.wmnet with OS bookworm
* 23:19 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1022.eqiad.wmnet with OS bookworm
* 23:18 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1021.eqiad.wmnet with OS bookworm
* 23:16 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
* 23:16 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
* 23:15 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
* 23:15 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
* 22:03 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260118{{!}}Drop inactive simple summary surveys (T389393)]] (duration: 08m 15s)
* 21:57 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 21:57 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1260118{{!}}Drop inactive simple summary surveys (T389393)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1260118{{!}}Drop inactive simple summary surveys (T389393)]]
* 21:52 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259147{{!}}Address FIXME and drop not selector for section headings (T420085)]] (duration: 13m 11s)
* 21:47 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 21:44 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1259147{{!}}Address FIXME and drop not selector for section headings (T420085)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:38 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1259147{{!}}Address FIXME and drop not selector for section headings (T420085)]]
* 21:00 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=frwiki --source-pseudo-namespace=Abstract_ --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:55 jforrester@deploy2002: mwscript-k8s job started: moveBatch --wiki=frwiki '--u=Jdforrester (WMF)' --r=[[phab:T420654|T420654]] --noredirects /home/jforrester/T420654-frwiki-move # [[phab:T420654|T420654]] abstract: is now an interwiki; manual fix
* 20:55 jforrester@deploy2002: mwscript-k8s job started: moveBatch '--u=Jdforrester (WMF)' --r=[[phab:T420654|T420654]] --noredirects /home/jforrester/T420654-frwiki-move # [[phab:T420654|T420654]] abstract: is now an interwiki; manual fix
* 20:47 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=ptwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:46 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=idwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:46 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=frwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:45 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=eswiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:39 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=enwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:39 jforrester@deploy2002: mwscript-k8s job started: sql extensions/WikimediaMaintenance/maintenance/namespaceDupes.php --wiki=enwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:38 jforrester@deploy2002: mwscript-k8s job started: sql maintenance/namespaceDupes.php --wiki=enwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:38 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1256433{{!}}[wikifunctions] Drop m.wikifunctions.org from lists, we've not used it for years]], [[gerrit:1250114{{!}}Move GrowthExperiments REST API definition to IS]], [[gerrit:1259993{{!}}dumpInterwiki: Re-generate to add Abstract Wikipedia (and others) (T420654)]] (duration: 07m 46s)
* 20:33 jforrester@deploy2002: jforrester: Continuing with sync
* 20:32 jforrester@deploy2002: jforrester: Backport for [[gerrit:1256433{{!}}[wikifunctions] Drop m.wikifunctions.org from lists, we've not used it for years]], [[gerrit:1250114{{!}}Move GrowthExperiments REST API definition to IS]], [[gerrit:1259993{{!}}dumpInterwiki: Re-generate to add Abstract Wikipedia (and others) (T420654)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified the
* 20:30 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1256433{{!}}[wikifunctions] Drop m.wikifunctions.org from lists, we've not used it for years]], [[gerrit:1250114{{!}}Move GrowthExperiments REST API definition to IS]], [[gerrit:1259993{{!}}dumpInterwiki: Re-generate to add Abstract Wikipedia (and others) (T420654)]]
* 20:27 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259967{{!}}Set json object before setting Abstract Wiki Id (T420916)]], [[gerrit:1259994{{!}}AbstractPreview: apply selected preview language lang/dir to abstract preview body (T420687)]], [[gerrit:1260092{{!}}AbstractTitle: Handle pageinfo responses without normalized titles (T420725)]], [[gerrit:1259992{{!}}[abstractwiki] Don't list abstract as a langlist entry
* 20:22 jforrester@deploy2002: jforrester: Continuing with sync
* 20:22 jforrester@deploy2002: jforrester: Backport for [[gerrit:1259967{{!}}Set json object before setting Abstract Wiki Id (T420916)]], [[gerrit:1259994{{!}}AbstractPreview: apply selected preview language lang/dir to abstract preview body (T420687)]], [[gerrit:1260092{{!}}AbstractTitle: Handle pageinfo responses without normalized titles (T420725)]], [[gerrit:1259992{{!}}[abstractwiki] Don't list abstract as a langlist entry (T420654)]] s
* 20:20 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1259967{{!}}Set json object before setting Abstract Wiki Id (T420916)]], [[gerrit:1259994{{!}}AbstractPreview: apply selected preview language lang/dir to abstract preview body (T420687)]], [[gerrit:1260092{{!}}AbstractTitle: Handle pageinfo responses without normalized titles (T420725)]], [[gerrit:1259992{{!}}[abstractwiki] Don't list abstract as a langlist entry
* 20:12 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1242542{{!}}Generate our own logo thumbnails rather than using MediaWiki's (T414048)]], [[gerrit:1250095{{!}}Enwikinews: Only enable flaggedRevs in article namespace (T418066)]], [[gerrit:1252684{{!}}Disable magic links on afwiki (T420142)]] (duration: 09m 22s)
* 20:08 jforrester@deploy2002: jforrester, pppery: Continuing with sync
* 20:05 jforrester@deploy2002: jforrester, pppery: Backport for [[gerrit:1242542{{!}}Generate our own logo thumbnails rather than using MediaWiki's (T414048)]], [[gerrit:1250095{{!}}Enwikinews: Only enable flaggedRevs in article namespace (T418066)]], [[gerrit:1252684{{!}}Disable magic links on afwiki (T420142)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:03 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1242542{{!}}Generate our own logo thumbnails rather than using MediaWiki's (T414048)]], [[gerrit:1250095{{!}}Enwikinews: Only enable flaggedRevs in article namespace (T418066)]], [[gerrit:1252684{{!}}Disable magic links on afwiki (T420142)]]
* 19:42 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 19:42 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 19:41 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 19:39 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1258300{{!}}tests: Make many things static for PHPUnit 10 (T420844)]], [[gerrit:1258301{{!}}phpunit.xml: Update configuration for PHPUnit 10 (T420844)]] (duration: 07m 21s)
* 19:35 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 19:35 reedy@deploy2002: reedy: Continuing with sync
* 19:34 reedy@deploy2002: reedy: Backport for [[gerrit:1258300{{!}}tests: Make many things static for PHPUnit 10 (T420844)]], [[gerrit:1258301{{!}}phpunit.xml: Update configuration for PHPUnit 10 (T420844)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:32 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1258300{{!}}tests: Make many things static for PHPUnit 10 (T420844)]], [[gerrit:1258301{{!}}phpunit.xml: Update configuration for PHPUnit 10 (T420844)]]
* 19:02 inflatador: bking@apt1002 `sudo -E reprepro -C component/opensearch2 include trixie-wikimedia ~/wmf-opensearch-search-plugins-2.19.5+3-trixie/wmf-opensearch-search-plugins_2.19.5+3_amd64.changes`
* 18:48 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 18:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1170: Degraded drive replaced [[phab:T420873|T420873]]
* 18:43 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 18:36 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 18:35 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 18:25 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:24 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:20 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 18:20 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 18:13 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 18:11 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 18:07 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on phab2002.codfw.wmnet with reason: [[phab:T420228|T420228]]
* 18:01 aokoth@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 18:01 aokoth@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 18:01 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:00 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:00 mutante: codesearch9.codesearch - systemctl restart hound_proxy ([[phab:T421147|T421147]])
* 17:34 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 17:30 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 17:20 blake@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 17:20 blake@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 17:20 blake@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 17:20 blake@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 17:00 blake@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:00 blake@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:00 blake@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:00 blake@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 16:47 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 16:38 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1113.*
* 16:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1170: Degraded drive replaced [[phab:T420873|T420873]]
* 16:24 sfaci@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:24 sfaci@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:22 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1113.eqiad.wmnet with OS trixie
* 16:05 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 16:04 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 16:03 blake@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 16:03 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 16:03 blake@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 16:03 blake@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 16:03 blake@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 16:03 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1113.eqiad.wmnet with reason: host reimage
* 15:54 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1113.eqiad.wmnet with reason: host reimage
* 15:54 bjensen: Services portion of the datacenter switchover is complete
* 15:51 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker2009.codfw.wmnet with OS trixie
* 15:46 blake@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) depool all services in codfw: Datacenter Switchover - [[phab:T413974|T413974]]
* 15:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 15:38 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1113.eqiad.wmnet with OS trixie
* 15:38 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1113.eqiad.wmnet with OS trixie
* 15:36 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 15:34 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker2009.codfw.wmnet with reason: host reimage
* 15:30 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker2009.codfw.wmnet with reason: host reimage
* 15:20 blake@cumin1003: START - Cookbook sre.discovery.datacenter depool all services in codfw: Datacenter Switchover - [[phab:T413974|T413974]]
* 15:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1113.eqiad.wmnet with OS trixie
* 15:18 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker2009.codfw.wmnet with OS trixie
* 15:16 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker2008.codfw.wmnet with OS trixie
* 14:59 blake@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool codfw [reason: no reason specified, no task ID specified]
* 14:59 blake@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool codfw [reason: no reason specified, no task ID specified]
* 14:59 bjensen: Beginning the Traffic and Services portions of the DC switchover; operational follow-up will be in #wikimedia-sre
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker2008.codfw.wmnet with reason: host reimage
* 14:56 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker2008.codfw.wmnet with reason: host reimage
* 14:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:50 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1009.eqiad.wmnet with OS trixie
* 14:44 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker2008.codfw.wmnet with OS trixie
* 14:42 aokoth@dns1004: END - running authdns-update
* 14:41 aokoth@dns1004: START - running authdns-update
* 14:34 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1009.eqiad.wmnet with reason: host reimage
* 14:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2009.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:27 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1009.eqiad.wmnet with reason: host reimage
* 14:26 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker2009.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:25 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:23 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:20 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2008.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:19 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:19 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:16 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1009.eqiad.wmnet with OS trixie
* 14:15 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker2008.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:14 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
* 14:13 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 14:13 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 14:13 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 14:12 dcausse@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259979{{!}}Revert "search: use the discovery ns record for the semanticsearch cluster"]] (duration: 06m 54s)
* 14:08 dcausse@deploy2002: dcausse: Continuing with sync
* 14:07 dcausse@deploy2002: dcausse: Backport for [[gerrit:1259979{{!}}Revert "search: use the discovery ns record for the semanticsearch cluster"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2007.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:05 dcausse@deploy2002: Started scap sync-world: Backport for [[gerrit:1259979{{!}}Revert "search: use the discovery ns record for the semanticsearch cluster"]]
* 14:04 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1008.eqiad.wmnet with OS trixie
* 14:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker2007.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:59 jforrester@deploy2002: mwscript-k8s job started: sql --wiki=abstractwiki /srv/mediawiki/php-1.46.0-wmf.20/extensions/Translate/sql/mysql/translate_message_group_subscriptions.sql # [[phab:T420656|T420656]] translate_message_group_subscriptions
* 13:59 dcausse@deploy2002: Sync cancelled.
* 13:57 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:52 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1008.eqiad.wmnet with reason: host reimage
* 13:46 dcausse@deploy2002: dcausse: Backport for [[gerrit:1259875{{!}}search: use the discovery ns record for the semanticsearch cluster (T414484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:44 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1008.eqiad.wmnet with reason: host reimage
* 13:44 dcausse@deploy2002: Started scap sync-world: Backport for [[gerrit:1259875{{!}}search: use the discovery ns record for the semanticsearch cluster (T414484)]]
* 13:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1008.eqiad.wmnet with OS trixie
* 13:32 sukhe: sudo cumin -b1 -s20 'C:bird' "run-puppet-agent --enable 'merging CR {{Gerrit|1248385}}, [[phab:T413740|T413740]]'"
* 13:30 cmelo@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259231{{!}}Enable the CampaignEvents extension on all wikibooks (T419597)]], [[gerrit:1259237{{!}}Enable $wgCampaignEventsEnableEventGoals in prod wikis (T414149)]] (duration: 12m 43s)
* 13:26 cmelo@deploy2002: cmelo, daimona: Continuing with sync
* 13:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1007.eqiad.wmnet with OS trixie
* 13:23 sukhe: sudo cumin 'C:bird' "disable-puppet 'merging CR {{Gerrit|1248385}}, [[phab:T413740|T413740]]'"
* 13:20 cmelo@deploy2002: cmelo, daimona: Backport for [[gerrit:1259231{{!}}Enable the CampaignEvents extension on all wikibooks (T419597)]], [[gerrit:1259237{{!}}Enable $wgCampaignEventsEnableEventGoals in prod wikis (T414149)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:18 cmelo@deploy2002: Started scap sync-world: Backport for [[gerrit:1259231{{!}}Enable the CampaignEvents extension on all wikibooks (T419597)]], [[gerrit:1259237{{!}}Enable $wgCampaignEventsEnableEventGoals in prod wikis (T414149)]]
* 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1007.eqiad.wmnet with reason: host reimage
* 13:04 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) payments1012.frack.eqiad.wmnet on all recursors
* 13:04 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache payments1012.frack.eqiad.wmnet on all recursors
* 13:04 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) payments1011.frack.eqiad.wmnet on all recursors
* 13:03 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache payments1011.frack.eqiad.wmnet on all recursors
* 13:03 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) payments1010.frack.eqiad.wmnet on all recursors
* 13:03 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache payments1010.frack.eqiad.wmnet on all recursors
* 13:02 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1007.eqiad.wmnet with reason: host reimage
* 13:00 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:00 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify records for payments servers frack - cmooney@cumin1003"
* 13:00 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify records for payments servers frack - cmooney@cumin1003"
* 12:56 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 12:50 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1007.eqiad.wmnet with OS trixie
* 12:02 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1017.eqiad.wmnet
* 12:02 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1017.eqiad.wmnet
* 12:01 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s1
* 11:53 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:53 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:51 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1017.eqiad.wmnet with reason: Rebooting clouddb1017 [[phab:T419960|T419960]]
* 11:51 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1017.eqiad.wmnet
* 11:51 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1017.eqiad.wmnet
* 11:51 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s1
* 11:49 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:49 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:48 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:48 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:46 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:46 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:40 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1006.eqiad.wmnet with OS trixie
* 11:36 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin1001.eqiad.wmnet
* 11:32 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet,service=x3
* 11:32 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1016.eqiad.wmnet,service=x3
* 11:32 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcumin1001.eqiad.wmnet
* 11:31 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1023.eqiad.wmnet,service=x3
* 11:31 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=x3
* 11:31 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1023.eqiad.wmnet,service=x3
* 11:31 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1022.eqiad.wmnet,service=x3
* 11:27 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s3
* 11:27 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1023.eqiad.wmnet,service=s3
* 11:27 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1023.eqiad.wmnet,service=s3
* 11:26 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1006.eqiad.wmnet with reason: host reimage
* 11:22 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:19 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:19 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1023.eqiad.wmnet,service=s3
* 11:18 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1006.eqiad.wmnet with reason: host reimage
* 11:18 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s3
* 11:17 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s3
* 11:14 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1023.eqiad.wmnet,service=s3
* 11:14 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:07 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1006.eqiad.wmnet with OS trixie
* 10:55 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:55 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:53 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker2007.codfw.wmnet with OS trixie
* 10:49 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker2006.codfw.wmnet with OS trixie
* 10:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:36 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker2007.codfw.wmnet with reason: host reimage
* 10:33 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker2006.codfw.wmnet with reason: host reimage
* 10:30 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:29 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:29 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker2007.codfw.wmnet with reason: host reimage
* 10:28 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker2006.codfw.wmnet with reason: host reimage
* 10:22 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:21 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:20 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:20 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:17 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host aux-k8s-worker1006.eqiad.wmnet with OS trixie
* 10:17 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker2007.codfw.wmnet with OS trixie
* 10:16 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker2006.codfw.wmnet with OS trixie
* 10:07 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1006.eqiad.wmnet with OS trixie
* 09:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:46 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:43 ayounsi@cumin1003: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:34 ayounsi@cumin1003: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:31 ayounsi@cumin1003: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:31 ayounsi@cumin1003: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:29 ayounsi@cumin1003: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:29 ayounsi@cumin1003: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:23 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4008.ulsfo.wmnet with OS bookworm
* 09:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 09:01 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 08:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 08:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old ulsfo ganeti VIP - ayounsi@cumin1003"
* 08:50 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old ulsfo ganeti VIP - ayounsi@cumin1003"
* 08:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:46 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1170: Degraded drive [[phab:T420873|T420873]]
* 08:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:45 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1170: Degraded drive [[phab:T420873|T420873]]
* 08:43 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:39 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host ganeti4008.ulsfo.wmnet with OS bookworm
* 08:31 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 08:31 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:29 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:27 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:27 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:25 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:24 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:13 hashar@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.21 refs [[phab:T420479|T420479]]
* 07:59 hashar: Changed https://logstash.wikimedia.org/ default page back to /app/dashboards
* 04:01 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.18 (duration: 01m 13s)
* 03:42 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.21 refs [[phab:T420479|T420479]] (duration: 39m 27s)
* 03:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.21 refs [[phab:T420479|T420479]]
* 02:46 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 04s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:56 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 01:40 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1104.*
* 01:37 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1104.eqiad.wmnet with OS trixie
* 01:14 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1104.eqiad.wmnet with reason: host reimage
* 01:08 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1104.eqiad.wmnet with reason: host reimage
* 00:52 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1104.eqiad.wmnet with OS trixie
* 00:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 00:18 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1115.eqiad.wmnet with OS trixie
== 2026-03-23 ==
* 22:51 rzl: root@apt1002:~# reprepro --noskipold --restrict vopsbot update bookworm-wikimedia
* 22:44 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 22:28 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host an-worker1172.eqiad.wmnet
* 22:25 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1104.eqiad.wmnet with OS trixie
* 22:07 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 22:05 maryum: Deployed security fix for [[phab:T415584|T415584]]
* 21:53 maryum: Deployed security fix for [[phab:T419192|T419192]]
* 21:41 maryum: Deployed security fix for [[phab:T419168|T419168]]
* 21:35 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 21:25 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255847{{!}}testwiki: Add temporary groups for security testing]] (duration: 12m 33s)
* 21:22 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 21:21 catrope@deploy2002: catrope: Continuing with sync
* 21:18 catrope@deploy2002: catrope: Backport for [[gerrit:1255847{{!}}testwiki: Add temporary groups for security testing]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:12 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1255847{{!}}testwiki: Add temporary groups for security testing]]
* 21:05 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1106.eqiad.wmnet [reason: trixie reimaging]
* 21:05 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1106.eqiad.wmnet [reason: trixie reimaging]
* 21:05 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1104.eqiad.wmnet with OS trixie
* 21:04 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1104.eqiad.wmnet [reason: trixie reimaging]
* 21:03 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1103.eqiad.wmnet [reason: trixie reimaging]
* 20:58 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1256394{{!}}Abstract Wikipedia: Fix API call to get page info (T420725)]], [[gerrit:1259085{{!}}[abstractwiki] Enable the Translate extension (T420656)]], [[gerrit:1250113{{!}}Move testwiki-only Attribution REST API definition to IS]] (duration: 11m 12s)
* 20:54 jforrester@deploy2002: jforrester: Continuing with sync
* 20:53 jforrester@deploy2002: jforrester: Backport for [[gerrit:1256394{{!}}Abstract Wikipedia: Fix API call to get page info (T420725)]], [[gerrit:1259085{{!}}[abstractwiki] Enable the Translate extension (T420656)]], [[gerrit:1250113{{!}}Move testwiki-only Attribution REST API definition to IS]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1103.eqiad.wmnet with OS trixie
* 20:50 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy4002.wikimedia.org
* 20:50 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:50 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy4002.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
* 20:50 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy4002.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
* 20:47 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1256394{{!}}Abstract Wikipedia: Fix API call to get page info (T420725)]], [[gerrit:1259085{{!}}[abstractwiki] Enable the Translate extension (T420656)]], [[gerrit:1250113{{!}}Move testwiki-only Attribution REST API definition to IS]]
* 20:46 sukhe@cumin1003: START - Cookbook sre.dns.netbox
* 20:45 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 20:43 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1102.eqiad.wmnet [reason: trixie reimaging]
* {{safesubst:SAL entry|1=20:42 dani@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254448{{!}}Undeploy participant recruitment survey on ptwiki (T419275)]], [[gerrit:1254450{{!}}Undeploy participant recruitment survey on trwiki (T419275)]], [[gerrit:1254452{{!}}Undeploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1255763{{!}}testKitchen: Add custom stream name (T417050)]], [[gerrit:1259120{{!}}Enable wgCampaignEventsEnableEventGoals in}}
* 20:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1102.eqiad.wmnet with OS trixie
* 20:41 sukhe@cumin1003: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy4002.wikimedia.org
* 20:40 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy4001.wikimedia.org
* 20:40 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:40 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy4001.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
* 20:39 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy4001.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
* 20:37 dani@deploy2002: milimetric, daimona, dani: Continuing with sync
* {{safesubst:SAL entry|1=20:36 dani@deploy2002: milimetric, daimona, dani: Backport for [[gerrit:1254448{{!}}Undeploy participant recruitment survey on ptwiki (T419275)]], [[gerrit:1254450{{!}}Undeploy participant recruitment survey on trwiki (T419275)]], [[gerrit:1254452{{!}}Undeploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1255763{{!}}testKitchen: Add custom stream name (T417050)]], [[gerrit:1259120{{!}}Enable wgCampaignEventsEnableEventGoals i}}
* 20:35 sukhe@cumin1003: START - Cookbook sre.dns.netbox
* {{safesubst:SAL entry|1=20:34 dani@deploy2002: Started scap sync-world: Backport for [[gerrit:1254448{{!}}Undeploy participant recruitment survey on ptwiki (T419275)]], [[gerrit:1254450{{!}}Undeploy participant recruitment survey on trwiki (T419275)]], [[gerrit:1254452{{!}}Undeploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1255763{{!}}testKitchen: Add custom stream name (T417050)]], [[gerrit:1259120{{!}}Enable wgCampaignEventsEnableEventGoals in}}
* 20:31 sukhe@cumin1003: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy4001.wikimedia.org
* 20:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1103.eqiad.wmnet with reason: host reimage
* 20:24 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1103.eqiad.wmnet with reason: host reimage
* 20:23 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 20:19 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1102.eqiad.wmnet with reason: host reimage
* 20:17 alexsanford@deploy2002: Finished scap sync-world: Backport for [[gerrit:1256472{{!}}Reduce reauth timeout for editing site JS to 10 minutes (T419605)]] (duration: 07m 32s)
* 20:14 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1102.eqiad.wmnet with reason: host reimage
* 20:13 alexsanford@deploy2002: alexsanford: Continuing with sync
* 20:11 alexsanford@deploy2002: alexsanford: Backport for [[gerrit:1256472{{!}}Reduce reauth timeout for editing site JS to 10 minutes (T419605)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 alexsanford@deploy2002: Started scap sync-world: Backport for [[gerrit:1256472{{!}}Reduce reauth timeout for editing site JS to 10 minutes (T419605)]]
* 20:08 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1103.eqiad.wmnet with OS trixie
* 20:07 alexsanford: Deployed mitigation for [[phab:T419605|T419605]]
* 19:58 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 19:58 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 19:58 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 19:58 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1102.eqiad.wmnet with OS trixie
* 19:57 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 19:54 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 19:54 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 19:54 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 19:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha-proxy4004.wikimedia.org
* 19:51 cdobbins@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1102.eqiad.wmnet with OS trixie
* 19:50 cdobbins@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1103.eqiad.wmnet with OS trixie
* 19:50 ayounsi@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha-proxy4004.wikimedia.org
* 19:47 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 19:47 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha-proxy4003.wikimedia.org
* 19:46 ayounsi@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha-proxy4003.wikimedia.org
* 19:44 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 19:44 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>aqs[1011,1014,1016-1022]*<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 19:42 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1103.eqiad.wmnet with OS trixie
* 19:42 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1103.eqiad.wmnet [reason: trixie reimaging]
* 19:41 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1101.eqiad.wmnet [reason: trixie reimaging]
* 19:41 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1102.eqiad.wmnet with OS trixie
* 19:41 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1102.eqiad.wmnet [reason: trixie reimaging]
* 19:40 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1101.eqiad.wmnet with OS trixie
* 19:39 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1100.eqiad.wmnet [reason: trixie reimaging]
* 19:37 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1100.eqiad.wmnet with OS trixie
* 19:30 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 19:18 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1101.eqiad.wmnet with reason: host reimage
* 19:14 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1100.eqiad.wmnet with reason: host reimage
* 19:13 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1101.eqiad.wmnet with reason: host reimage
* 19:13 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:13 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:10 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1100.eqiad.wmnet with reason: host reimage
* 18:59 inflatador: bking@deploy2002 restarting opensearch-semantic-search eqiad to renew certs
* 18:57 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1101.eqiad.wmnet with OS trixie
* 18:55 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1101.eqiad.wmnet with OS trixie
* 18:54 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1100.eqiad.wmnet with OS trixie
* 18:53 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1100.eqiad.wmnet with OS trixie
* 18:50 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 18:49 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 18:36 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on hcaptcha-proxy4002.wikimedia.org with reason: depooled host (soon to be decomed)
* 18:35 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on hcaptcha-proxy4001.wikimedia.org with reason: depooled host (soon to be decomed)
* 18:10 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 18:10 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 18:05 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>aqs[1011,1014,1016-1022]*<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 17:54 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 17:54 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1115.eqiad.wmnet with OS trixie
* 17:53 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase-eqiad
* 17:49 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259136{{!}}EventStreamConfig: Document not adding performer attributes to SI interaction v2 stream (T418740)]] (duration: 06m 28s)
* 17:45 dreamyjazz@deploy2002: kharlan, dreamyjazz: Continuing with sync
* 17:45 dreamyjazz@deploy2002: kharlan, dreamyjazz: Backport for [[gerrit:1259136{{!}}EventStreamConfig: Document not adding performer attributes to SI interaction v2 stream (T418740)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:43 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1259136{{!}}EventStreamConfig: Document not adding performer attributes to SI interaction v2 stream (T418740)]]
* 17:35 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:34 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1101.eqiad.wmnet with OS trixie
* 17:34 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1101.eqiad.wmnet [reason: trixie reimaging]
* 17:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:31 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1100.eqiad.wmnet with OS trixie
* 17:30 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1100.eqiad.wmnet [reason: trixie reimaging]
* 17:27 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:26 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:24 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:23 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:22 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:21 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 17:21 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:20 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 17:20 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:18 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:17 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:17 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:16 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 17:13 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:13 bd808@deploy2002: Finished deploy [releng/jenkins-deploy@f47af21] (releasing): jobs: Use TZ=UTC in branchMWSingleVersion.groovy trigger ([[phab:T404399|T404399]]) (duration: 01m 36s)
* 17:12 bd808@deploy2002: Started deploy [releng/jenkins-deploy@f47af21] (releasing): jobs: Use TZ=UTC in branchMWSingleVersion.groovy trigger ([[phab:T404399|T404399]])
* 17:12 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:09 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:08 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:08 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:08 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:07 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:06 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:04 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:04 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:03 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:02 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 17:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1001.eqiad.wmnet
* 16:56 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:56 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 14 hosts
* 16:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:55 sukhe@cumin1003: START - Cookbook sre.hosts.remove-downtime for 14 hosts
* 16:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1001.eqiad.wmnet
* 16:53 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:52 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:52 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:46 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:41 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:38 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin2001.codfw.wmnet
* 16:35 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 16:34 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:34 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:34 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcumin2001.codfw.wmnet
* 16:32 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 16:31 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 16:30 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:29 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1023.eqiad.wmnet
* 16:29 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1023.eqiad.wmnet
* 16:28 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:27 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:24 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for aqs1010.eqiad.wmnet
* 16:24 eevans@cumin1003: START - Cookbook sre.hosts.remove-downtime for aqs1010.eqiad.wmnet
* 16:21 jgreen@dns1004: END - running authdns-update
* 16:19 jgreen@dns1004: START - running authdns-update
* 16:18 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:17 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:11 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Rebooting clouddb1023 [[phab:T419960|T419960]]
* 16:09 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1025.eqiad.wmnet
* 16:09 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1025.eqiad.wmnet
* 16:09 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1025.eqiad.wmnet
* 16:04 urandom: stopping aqs1010 for SSD replacement — [[phab:T420867|T420867]]
* 16:03 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:03 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on aqs1010.eqiad.wmnet with reason: Shutting down for SSD replacement — [[phab:T420867|T420867]]
* 15:58 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1025.eqiad.wmnet
* 15:57 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1025.eqiad.wmnet with reason: Rebooting clouddb1025 [[phab:T419960|T419960]]
* 15:57 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:53 topranks: disabling puppet for nftables-enabled machines to validate new ruleset on selected hosts before wider rollout [[phab:T420715|T420715]]
* 15:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:50 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:49 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 15:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:31 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1172.eqiad.wmnet
* 15:21 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 15:20 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 15:15 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1020.eqiad.wmnet
* 15:14 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1020.eqiad.wmnet
* 15:14 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1020.eqiad.wmnet
* 15:05 btullis@cumin1003: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker1172.eqiad.wmnet
* 15:03 btullis@cumin1003: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1172.eqiad.wmnet
* 15:03 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) opensearch-ipoid.discovery.wmnet on all recursors
* 15:03 bking@cumin2002: START - Cookbook sre.dns.wipe-cache opensearch-ipoid.discovery.wmnet on all recursors
* 15:03 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 15:01 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 15:01 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) opensearch-ipoid.discovery.wmnet on all recursors
* 14:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache opensearch-ipoid.discovery.wmnet on all recursors
* 14:58 sukhe@dns1004: END - running authdns-update
* 14:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) opensearch-test.discovery.wmnet on all recursors
* 14:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache opensearch-test.discovery.wmnet on all recursors
* 14:57 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet
* 14:57 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:56 sukhe@dns1004: START - running authdns-update
* 14:56 sukhe@dns1004: END - running authdns-update
* 14:56 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1020.eqiad.wmnet with reason: Rebooting clouddb1020 [[phab:T419960|T419960]]
* 14:56 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1019.eqiad.wmnet
* 14:56 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1019.eqiad.wmnet
* 14:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:55 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1019.eqiad.wmnet
* 14:55 sukhe@dns1004: START - running authdns-update
* 14:55 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase-eqiad
* 14:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:49 sukhe@dns1004: END - running authdns-update
* 14:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:48 sukhe@dns1004: START - running authdns-update
* 14:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:46 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1003.eqiad.wmnet
* 14:45 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 14:44 sukhe@dns1004: END - running authdns-update
* 14:43 sukhe@dns1004: START - running authdns-update
* 14:40 sukhe@dns1004: FAIL - running authdns-update
* 14:39 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1003.eqiad.wmnet
* 14:38 sukhe@dns1004: START - running authdns-update
* 14:37 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 14:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) k8s-ingress-dse-aa.discovery.wmnet on all recursors
* 14:36 bking@cumin2002: START - Cookbook sre.dns.wipe-cache k8s-ingress-dse-aa.discovery.wmnet on all recursors
* 14:34 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1019.eqiad.wmnet
* 14:34 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1019.eqiad.wmnet with reason: Rebooting clouddb1019 [[phab:T419960|T419960]]
* 14:33 sukhe@dns1004: FAIL - running authdns-update
* 14:33 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1002.eqiad.wmnet
* 14:33 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1018.eqiad.wmnet
* 14:33 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1018.eqiad.wmnet
* 14:32 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1018.eqiad.wmnet
* 14:32 sukhe@dns1004: START - running authdns-update
* 14:31 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=codfw
* 14:30 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-worker1172.eqiad.wmnet with OS bullseye
* 14:30 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:27 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1002.eqiad.wmnet
* 14:22 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1018.eqiad.wmnet with reason: Rebooting clouddb1018 [[phab:T419960|T419960]]
* 14:22 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1018.eqiad.wmnet
* 14:22 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1018.eqiad.wmnet
* 14:21 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1018.eqiad.wmnet
* 14:20 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:17 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1001.eqiad.wmnet
* 14:14 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 14:14 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2352-2356].codfw.wmnet
* 14:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2352-2356].codfw.wmnet
* 14:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on db1253.eqiad.wmnet with reason: Under repair
* 14:11 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1001.eqiad.wmnet
* 14:07 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2352-2356].codfw.wmnet
* 14:04 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha2002.wikimedia.org
* 14:04 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2352-2356].codfw.wmnet
* 14:03 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2347-2351].codfw.wmnet
* 14:03 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2347-2351].codfw.wmnet
* 14:00 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha2002.wikimedia.org
* 14:00 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha2001.wikimedia.org
* 13:59 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1172.eqiad.wmnet with reason: host reimage
* 13:57 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2347-2351].codfw.wmnet
* 13:56 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha2001.wikimedia.org
* 13:56 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:55 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha1002.wikimedia.org
* 13:55 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1172.eqiad.wmnet with reason: host reimage
* 13:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2347-2351].codfw.wmnet
* 13:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2342-2346].codfw.wmnet
* 13:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2342-2346].codfw.wmnet
* 13:51 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha1002.wikimedia.org
* 13:51 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha1001.wikimedia.org
* 13:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:47 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha1001.wikimedia.org
* 13:47 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2342-2346].codfw.wmnet
* 13:43 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2342-2346].codfw.wmnet
* 13:43 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2337-2341].codfw.wmnet
* 13:43 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2337-2341].codfw.wmnet
* 13:42 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host an-worker1172.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:42 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 13:41 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2012.codfw.wmnet
* 13:41 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2011.codfw.wmnet
* 13:38 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudlb1002.eqiad.wmnet
* 13:36 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2012.codfw.wmnet
* 13:36 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2011.codfw.wmnet
* 13:36 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2337-2341].codfw.wmnet
* 13:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2337-2341].codfw.wmnet
* 13:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2336].codfw.wmnet
* 13:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2336].codfw.wmnet
* 13:30 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2003.codfw.wmnet
* 13:30 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1002.eqiad.wmnet
* 13:29 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudlb1002.eqiad.wmnet
* 13:29 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1002.eqiad.wmnet
* 13:28 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudlb1001.eqiad.wmnet
* 13:25 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2332-2336].codfw.wmnet
* 13:24 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2003.codfw.wmnet
* 13:21 jforrester@deploy2002: mwscript-k8s job started: extensions/WikimediaMaintenance/maintenance/createExtensionTables.php --wiki=abstractwiki translate # [[phab:T420656|T420656]]
* 13:21 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2332-2336].codfw.wmnet
* 13:21 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 13:20 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1001.eqiad.wmnet
* 13:20 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudlb1001.eqiad.wmnet
* 13:20 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1001.eqiad.wmnet
* 13:19 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:19 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudlb1001.eqiad.wmnet
* 13:18 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1001.eqiad.wmnet
* 13:17 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259035{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added (T420722)]], [[gerrit:1259036{{!}}tests: add coverage for WelcomeSurveyHooks::onCentralAuthPostLoginRedirect (T420722)]], [[gerrit:1259046{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added 2 (T420722)]] (duration: 11m 43s)
* 13:16 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:14 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:13 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:11 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:11 sgimeno@deploy2002: sgimeno: Continuing with sync
* 13:08 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2005-2006,2011-2018,2033-2039,2041-2042,2044,2046,2049-2051,2055-2062,2064-2065,2067-2078,2087-2095,2102-2115,2124-2179,2184-2199].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 13:08 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2186-2199].codfw.wmnet
* 13:08 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2186-2199].codfw.wmnet
* 13:07 sgimeno@deploy2002: sgimeno: Backport for [[gerrit:1259035{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added (T420722)]], [[gerrit:1259036{{!}}tests: add coverage for WelcomeSurveyHooks::onCentralAuthPostLoginRedirect (T420722)]], [[gerrit:1259046{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added 2 (T420722)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1259035{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added (T420722)]], [[gerrit:1259036{{!}}tests: add coverage for WelcomeSurveyHooks::onCentralAuthPostLoginRedirect (T420722)]], [[gerrit:1259046{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added 2 (T420722)]]
* 12:43 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "bast4006 - ayounsi@cumin1003"
* 12:42 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "bast4006 - ayounsi@cumin1003"
* 12:42 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host testvm2006.codfw.wmnet with OS bookworm
* 12:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host bast4006.wikimedia.org
* 12:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast4006.wikimedia.org with OS bookworm
* 12:34 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:28 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
* 12:22 cmooney@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
* 12:18 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast4006.wikimedia.org with reason: host reimage
* 12:16 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2186-2199].codfw.wmnet
* 12:14 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on bast4006.wikimedia.org with reason: host reimage
* 12:08 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host testvm2006.codfw.wmnet with OS bookworm
* 12:07 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2186-2199].codfw.wmnet
* 12:04 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2168-2179,2184-2185].codfw.wmnet
* 12:04 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2168-2179,2184-2185].codfw.wmnet
* 11:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2168-2179,2184-2185].codfw.wmnet
* 11:44 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2168-2179,2184-2185].codfw.wmnet
* 11:40 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2154-2167].codfw.wmnet
* 11:40 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2154-2167].codfw.wmnet
* 11:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2154-2167].codfw.wmnet
* 11:23 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2154-2167].codfw.wmnet
* 11:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2140-2153].codfw.wmnet
* 11:20 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS bookworm
* 11:20 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2140-2153].codfw.wmnet
* 11:19 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM bast4006.wikimedia.org - ayounsi@cumin1003"
* 11:19 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM bast4006.wikimedia.org - ayounsi@cumin1003"
* 11:19 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) bast4006.wikimedia.org on all recursors
* 11:19 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache bast4006.wikimedia.org on all recursors
* 11:19 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:19 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM bast4006.wikimedia.org - ayounsi@cumin1003"
* 11:19 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM bast4006.wikimedia.org - ayounsi@cumin1003"
* 11:15 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 11:15 ayounsi@cumin1003: START - Cookbook sre.ganeti.makevm for new host bast4006.wikimedia.org
* 11:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts install4003.wikimedia.org
* 11:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install4003.wikimedia.org decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003"
* 11:08 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install4003.wikimedia.org decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003"
* 11:08 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2140-2153].codfw.wmnet
* 11:05 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 11:00 ayounsi@cumin1003: START - Cookbook sre.hosts.decommission for hosts install4003.wikimedia.org
* 10:57 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:55 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2140-2153].codfw.wmnet
* 10:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2126-2139].codfw.wmnet
* 10:55 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2126-2139].codfw.wmnet
* 10:44 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2126-2139].codfw.wmnet
* 10:43 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:38 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 10:38 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:38 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:38 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 10:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2126-2139].codfw.wmnet
* 10:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2104-2115,2124-2125].codfw.wmnet
* 10:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2104-2115,2124-2125].codfw.wmnet
* 10:28 topranks: disable puppet on routed-ganeti hosts to test nftables update on specific nodes [[phab:T420715|T420715]]
* 10:27 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s1
* 10:25 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s1
* 10:25 ayounsi@dns1004: END - running authdns-update
* 10:24 ayounsi@dns1004: START - running authdns-update
* 10:23 btullis@cumin1003: START - Cookbook sre.hosts.provision for host an-worker1172.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 10:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2104-2115,2124-2125].codfw.wmnet
* 10:20 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s4
* 10:18 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s4
* 10:13 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s8
* 10:11 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s8
* 10:09 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1172.eqiad.wmnet with OS bullseye
* 10:08 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2104-2115,2124-2125].codfw.wmnet
* 10:05 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s7
* 10:05 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2076-2078,2087-2095,2102-2103].codfw.wmnet
* 10:04 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2076-2078,2087-2095,2102-2103].codfw.wmnet
* 10:04 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s7
* 09:58 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s3
* 09:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:57 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s3
* 09:53 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:53 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2076-2078,2087-2095,2102-2103].codfw.wmnet
* 09:52 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s2
* 09:49 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s2
* 09:49 blake@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 09:49 blake@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 09:49 blake@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 09:49 blake@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 09:44 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2076-2078,2087-2095,2102-2103].codfw.wmnet
* 09:44 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s5
* 09:44 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2060-2062,2064-2065,2067-2075].codfw.wmnet
* 09:44 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2060-2062,2064-2065,2067-2075].codfw.wmnet
* 09:42 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s5
* 09:40 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:39 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:33 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s6
* 09:32 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s6
* 09:29 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:29 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:25 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 09:25 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 09:24 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section es7
* 09:23 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section es7
* 09:22 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 09:22 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section es6
* 09:16 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section es6
* 09:11 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:11 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:10 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section x3
* 09:09 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section x3
* 09:08 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2060-2062,2064-2065,2067-2075].codfw.wmnet
* 09:05 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:05 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:02 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section x1
* 09:01 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section x1
* 09:00 federico3: starting [[phab:T416706|T416706]]
* 09:00 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2060-2062,2064-2065,2067-2075].codfw.wmnet
* 08:59 fceratto@cumin1003: END (FAIL) - Cookbook sre.switchdc.databases.prepare (exit_code=99) for the switch from eqiad to codfw for section test-s4
* 08:59 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from eqiad to codfw for section test-s4
* 08:59 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2038-2039,2041-2042,2044,2046,2049-2051,2055-2059].codfw.wmnet
* 08:59 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2038-2039,2041-2042,2044,2046,2049-2051,2055-2059].codfw.wmnet
* 08:50 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2038-2039,2041-2042,2044,2046,2049-2051,2055-2059].codfw.wmnet
* 08:46 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255736{{!}}hcaptcha: Use the global edit key for MobileFrontend edits if present (T420574)]] (duration: 14m 42s)
* 08:44 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 08:43 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 08:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2038-2039,2041-2042,2044,2046,2049-2051,2055-2059].codfw.wmnet
* 08:40 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2005-2006,2011-2018,2033-2037].codfw.wmnet
* 08:40 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2005-2006,2011-2018,2033-2037].codfw.wmnet
* 08:40 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:39 kharlan@deploy2002: kharlan: Continuing with sync
* 08:38 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:37 kharlan@deploy2002: kharlan: Backport for [[gerrit:1255736{{!}}hcaptcha: Use the global edit key for MobileFrontend edits if present (T420574)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 08:31 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1255736{{!}}hcaptcha: Use the global edit key for MobileFrontend edits if present (T420574)]]
* 08:29 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2005-2006,2011-2018,2033-2037].codfw.wmnet
* 08:19 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2005-2006,2011-2018,2033-2037].codfw.wmnet
* 08:18 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2005-2006,2011-2018,2033-2039,2041-2042,2044,2046,2049-2051,2055-2062,2064-2065,2067-2078,2087-2095,2102-2115,2124-2179,2184-2199].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 07:45 kartik@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254149{{!}}Enable ULS rewrite beta feature (T418187 T253303)]] (duration: 41m 30s)
* 07:42 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 07:33 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 07:30 kartik@deploy2002: kartik, abi: Continuing with sync
* 07:30 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 07:29 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 07:22 kartik@deploy2002: kartik, abi: Backport for [[gerrit:1254149{{!}}Enable ULS rewrite beta feature (T418187 T253303)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 07:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 07:17 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 07:16 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 07:03 kartik@deploy2002: Started scap sync-world: Backport for [[gerrit:1254149{{!}}Enable ULS rewrite beta feature (T418187 T253303)]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 55s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-22 ==
* 02:50 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doh7004.wikimedia.org with reason: depooled host
* 02:50 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doh7003.wikimedia.org with reason: depooled host
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 21s)
* 02:01 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-20 ==
* 23:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2013.codfw.wmnet
* 23:30 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs2013.codfw.wmnet
* 22:34 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host lvs2013.codfw.wmnet
* 22:34 brett: Started pybal on lvs2013
* 22:27 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 21:57 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5023.eqsin.wmnet [reason: trixie reimaging]
* 21:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5023.eqsin.wmnet with OS trixie
* 21:55 hashar: Upgrading CI Jenkins [[phab:T420477|T420477]]
* 21:25 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5023.eqsin.wmnet with reason: host reimage
* 21:21 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5023.eqsin.wmnet with reason: host reimage
* 21:04 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2013.codfw.wmnet with reason: debugging ipip
* 20:46 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5023.eqsin.wmnet with OS trixie
* 20:45 mutante: contint1003/2003 apt remove --purge apache2* ; apt remove --purge php* {{!}} [[phab:T418521|T418521]]
* 20:43 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
* 20:40 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 20:38 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5023.eqsin.wmnet with OS trixie
* 20:24 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doh3006.wikimedia.org with reason: depooled host
* 20:24 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doh3005.wikimedia.org with reason: depooled host
* 20:23 sukhe@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on doh3005.wikimedia.org with reason: depooled host
* 19:50 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2013.codfw.wmnet with reason: debugging ipip
* 19:33 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
* 19:30 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 19:21 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy (exit_code=0) rolling reboot on A:tcpproxy and A:tcpproxy
* 19:16 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5023.eqsin.wmnet with OS trixie
* 19:16 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5023.eqsin.wmnet [reason: trixie reimaging]
* 19:16 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5021.eqsin.wmnet [reason: trixie reimaging]
* 19:14 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5021.eqsin.wmnet with OS trixie
* 18:52 cdobbins@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2013.codfw.wmnet with reason: reboot
* 18:43 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5021.eqsin.wmnet with reason: host reimage
* 18:39 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5021.eqsin.wmnet with reason: host reimage
* 18:28 cdobbins@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2013.codfw.wmnet with reason: reboot
* 18:16 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy rolling reboot on A:tcpproxy and A:tcpproxy
* 18:14 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on db1253.eqiad.wmnet with reason: [[phab:T420041|T420041]]
* 17:59 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5021.eqsin.wmnet with OS trixie
* 17:54 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5021.eqsin.wmnet with OS trixie
* 17:51 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host lvs2014.codfw.wmnet
* 17:40 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on contint1003.wikimedia.org with reason: jenkins on java21
* 17:39 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2014.codfw.wmnet
* 16:54 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 16:54 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1172.eqiad.wmnet with OS bullseye
* 16:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 16:33 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5021.eqsin.wmnet with OS trixie
* 16:32 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5021.eqsin.wmnet [reason: trixie reimaging]
* 16:09 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1172.eqiad.wmnet with OS bullseye
* 16:08 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2002.codfw.wmnet
* 16:02 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf2002.codfw.wmnet
* 15:51 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2001.codfw.wmnet
* 15:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 15:45 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf2001.codfw.wmnet
* 15:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2041.codfw.wmnet
* 15:37 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc2041.codfw.wmnet
* 15:32 cparle@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:32 cparle@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-eqiad
* 15:16 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2040.codfw.wmnet
* 15:10 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc2040.codfw.wmnet
* 15:02 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 15:01 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 15:00 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:59 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:58 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:58 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:57 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:56 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:55 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:50 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2002].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 14:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2002.codfw.wmnet
* 14:45 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2002.codfw.wmnet
* 14:44 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:44 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:37 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2002.codfw.wmnet
* 14:37 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2002.codfw.wmnet
* 14:37 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2001.codfw.wmnet
* 14:37 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2001.codfw.wmnet
* 14:36 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:35 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:35 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:34 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:30 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2001.codfw.wmnet
* 14:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2001.codfw.wmnet
* 14:29 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2002].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 14:27 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1335-1349].eqiad.wmnet
* 14:27 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1335-1349].eqiad.wmnet
* 14:26 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2039.codfw.wmnet
* 14:21 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc2039.codfw.wmnet
* 14:16 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:16 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2038.codfw.wmnet
* 14:08 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc2038.codfw.wmnet
* 13:54 jgreen@dns1004: END - running authdns-update
* 13:52 jgreen@dns1004: START - running authdns-update
* 13:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:39 inflatador: bking@deploy2002 restarting opensearch-ipoid cluster to apply new certificates
* 13:33 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-eqiad
* 13:20 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:14 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-canary
* 13:14 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for doh[3005-3006].wikimedia.org
* 13:14 sukhe@cumin1003: START - Cookbook sre.hosts.remove-downtime for doh[3005-3006].wikimedia.org
* 13:08 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-canary
* 13:05 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:03 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:58 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2006.codfw.wmnet
* 12:56 cparle@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 12:55 cparle@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 12:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2006.codfw.wmnet
* 12:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2001.codfw.wmnet
* 12:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2001.codfw.wmnet
* 12:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1005.eqiad.wmnet
* 12:35 jiji@cumin1003: END (ERROR) - Cookbook sre.memcached.roll-reboot-restart (exit_code=97) rolling reboot on A:memcached-codfw
* 12:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1005.eqiad.wmnet
* 12:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw2001.wikimedia.org
* 12:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw2001.wikimedia.org
* 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw1001.wikimedia.org
* 12:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw1001.wikimedia.org
* 11:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 11:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1172.eqiad.wmnet with OS bullseye
* 11:27 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 11:24 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 10:26 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 10:13 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:12 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:10 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:05 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:05 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:04 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:02 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:58 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:57 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:57 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:56 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:56 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:55 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:53 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:50 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:46 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:46 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:45 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:37 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:36 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:36 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:35 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:35 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:34 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:34 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:33 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:26 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:25 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:24 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:23 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:19 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:19 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:18 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:18 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:18 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:15 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:57 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 05:30 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5024.eqsin.wmnet [reason: trixie reimaging]
* 05:30 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5019.eqsin.wmnet [reason: trixie reimaging]
* 02:43 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on doh3005.wikimedia.org with reason: alerting is flapping
* 02:42 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on doh3006.wikimedia.org with reason: alerting is flapping
* 01:21 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5019.eqsin.wmnet with OS trixie
* 01:17 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5024.eqsin.wmnet with OS trixie
* 00:48 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5019.eqsin.wmnet with reason: host reimage
* 00:44 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
* 00:38 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5019.eqsin.wmnet with reason: host reimage
* 00:37 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
* 00:05 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5024.eqsin.wmnet with OS trixie
* 00:01 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5024.eqsin.wmnet with OS trixie
* 00:01 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5024.eqsin.wmnet with OS trixie
== 2026-03-19 ==
* 23:59 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5019.eqsin.wmnet with OS trixie
* 23:40 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255801{{!}}Make the handler follow the thumb steps (T414805)]] (duration: 06m 14s)
* 23:36 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 23:35 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1255801{{!}}Make the handler follow the thumb steps (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:33 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1255801{{!}}Make the handler follow the thumb steps (T414805)]]
* 22:48 zabe@deploy2002: mwscript-k8s job started: foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https # [[phab:T420643|T420643]]
* 22:19 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 22:18 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 22:17 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 22:16 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 22:08 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255886{{!}}Set WikiLambdaAbstractNamespaces's merge_strategy to provide_default (T420649)]] (duration: 06m 46s)
* 22:04 jforrester@deploy2002: jforrester: Continuing with sync
* 22:03 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255886{{!}}Set WikiLambdaAbstractNamespaces's merge_strategy to provide_default (T420649)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:01 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255886{{!}}Set WikiLambdaAbstractNamespaces's merge_strategy to provide_default (T420649)]]
* 21:58 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5019.eqsin.wmnet with OS trixie
* 21:57 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase-codfw
* 21:57 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5019.eqsin.wmnet [reason: trixie reimaging]
* 21:56 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5024.eqsin.wmnet with OS trixie
* 21:56 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5024.eqsin.wmnet [reason: trixie reimaging]
* 21:55 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255765{{!}}Implement addListener fallback for older browsers in matchMedia (T419717)]] (duration: 07m 17s)
* 21:51 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 21:49 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1255765{{!}}Implement addListener fallback for older browsers in matchMedia (T419717)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:48 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1255765{{!}}Implement addListener fallback for older browsers in matchMedia (T419717)]]
* 21:29 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255881{{!}}Skins: Address issue with blurry images for large thumbnails (T375981)]] (duration: 07m 03s)
* 21:25 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 21:24 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1255881{{!}}Skins: Address issue with blurry images for large thumbnails (T375981)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:22 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1255881{{!}}Skins: Address issue with blurry images for large thumbnails (T375981)]]
* 21:11 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup2020.codfw.wmnet with reason: kernel module reload
* 21:10 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 11 hosts with reason: kernel module reload
* 20:36 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254865{{!}}Deploy Extension:PersonalDashboard to English Wikipedia (T418367)]], [[gerrit:1253654{{!}}Deploy PRV to 13 wikis (T420273)]] (duration: 11m 00s)
* 20:32 kgraessle@deploy2002: kgraessle, arlolra: Continuing with sync
* 20:27 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1016.eqiad.wmnet
* 20:27 kgraessle@deploy2002: kgraessle, arlolra: Backport for [[gerrit:1254865{{!}}Deploy Extension:PersonalDashboard to English Wikipedia (T418367)]], [[gerrit:1253654{{!}}Deploy PRV to 13 wikis (T420273)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1254865{{!}}Deploy Extension:PersonalDashboard to English Wikipedia (T418367)]], [[gerrit:1253654{{!}}Deploy PRV to 13 wikis (T420273)]]
* 20:25 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1016.eqiad.wmnet
* 20:11 cdobbins@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs1016.eqiad.wmnet with reason: reboot
* 20:01 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:01 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add analytic vlan hostnames - cmooney@cumin1003"
* 20:01 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add analytic vlan hostnames - cmooney@cumin1003"
* 19:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1018.eqiad.wmnet
* 19:56 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 19:56 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1018.eqiad.wmnet
* 19:55 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 19:53 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) 4.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.8.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa. on all recursors
* 19:53 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache 4.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.8.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa. on all recursors
* 19:52 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 19:51 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 7 hosts with reason: kernel module reload
* 19:44 topranks: disable IPv6 VRRP for et-1/0/5.1023 sub-interfaces on eqiad core routers [[phab:T405562|T405562]]
* 19:36 brett: stopping pybal/puppet on lvs1018 for reboots
* 19:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: reboots
* 19:00 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 6 hosts with reason: kernel module reload
* 19:00 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1019.eqiad.wmnet
* 19:00 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase-codfw
* 19:00 topranks: add vlan sub-interface for analytics1-d-eqiad vlan to leaf switches in eqiad row d [[phab:T405562|T405562]]
* 18:44 cdobbins@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs1019.eqiad.wmnet with reason: planned reboot
* 18:42 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:aqs-codfw
* 18:31 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255824{{!}}RepoBooks::onMediaWikiServices: Skip all low NSes, not just NS0 (T420617)]], [[gerrit:1255820{{!}}SpecialAbstractContent: Fix hard-coded policy list page namespace]], [[gerrit:1255794{{!}}[abstractwiki] Allow "Abstract:" as well as "Abstract Wikipedia:" as NS_PROJECT]] (duration: 06m 20s)
* 18:27 jforrester@deploy2002: jforrester: Continuing with sync
* 18:26 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255824{{!}}RepoBooks::onMediaWikiServices: Skip all low NSes, not just NS0 (T420617)]], [[gerrit:1255820{{!}}SpecialAbstractContent: Fix hard-coded policy list page namespace]], [[gerrit:1255794{{!}}[abstractwiki] Allow "Abstract:" as well as "Abstract Wikipedia:" as NS_PROJECT]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:24 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255824{{!}}RepoBooks::onMediaWikiServices: Skip all low NSes, not just NS0 (T420617)]], [[gerrit:1255820{{!}}SpecialAbstractContent: Fix hard-coded policy list page namespace]], [[gerrit:1255794{{!}}[abstractwiki] Allow "Abstract:" as well as "Abstract Wikipedia:" as NS_PROJECT]]
* 18:02 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1020.eqiad.wmnet
* 17:55 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
* 17:46 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 17:46 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 17:45 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host lvs1020.eqiad.wmnet
* 17:44 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
* 17:30 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 17:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host hcaptcha-proxy4004.wikimedia.org
* 17:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy4004.wikimedia.org with OS bookworm
* 17:26 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on contint1003.wikimedia.org with reason: jenkins on java21
* 17:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp5026.eqsin.wmnet
* 17:22 bd808@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:22 bd808@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:22 bd808@deploy2002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:22 bd808@deploy2002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:21 bd808@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 17:15 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp5026.eqsin.wmnet
* 17:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy4004.wikimedia.org with reason: host reimage
* 17:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts doh4002.wikimedia.org
* 17:08 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:08 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: doh4002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 17:08 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy4004.wikimedia.org with reason: host reimage
* 17:08 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: doh4002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 17:07 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 17:07 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 17:05 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp5026.eqsin.wmnet with reason: firmware updates
* 17:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 17:03 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5025.*
* 17:01 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp5025.eqsin.wmnet
* 16:59 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts doh4002.wikimedia.org
* 16:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts doh4001.wikimedia.org
* 16:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: doh4001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 16:59 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: doh4001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 16:58 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 16:57 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 16:57 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 16:56 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 16:54 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:52 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp5025.eqsin.wmnet
* 16:50 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts doh4001.wikimedia.org
* 16:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-master1004.eqiad.wmnet
* 16:46 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1151.eqiad.wmnet
* 16:45 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5026.eqsin.wmnet with OS trixie
* 16:44 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy4004.wikimedia.org with OS bookworm
* 16:44 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet,service=s3
* 16:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4004.wikimedia.org - jmm@cumin2002"
* 16:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4004.wikimedia.org - jmm@cumin2002"
* 16:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy4004.wikimedia.org on all recursors
* 16:43 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy4004.wikimedia.org on all recursors
* 16:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4004.wikimedia.org - jmm@cumin2002"
* 16:42 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4004.wikimedia.org - jmm@cumin2002"
* 16:42 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255039{{!}}Activate Abstract Wikipedia (T411723)]] (duration: 06m 09s)
* 16:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp5025.eqsin.wmnet with reason: firmware updates
* 16:41 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-master1004.eqiad.wmnet
* 16:40 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5025.eqsin.wmnet with OS trixie
* 16:39 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1151.eqiad.wmnet
* 16:39 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:39 jmm@cumin2002: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
* 16:38 jforrester@deploy2002: jforrester: Continuing with sync
* 16:38 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255039{{!}}Activate Abstract Wikipedia (T411723)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:36 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255039{{!}}Activate Abstract Wikipedia (T411723)]]
* 16:35 jmm@cumin2002: END (PASS) - Cookbook sre.netbox.restart-reboot (exit_code=0) rolling reboot on A:netbox
* 16:33 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255779{{!}}Revert "[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki"]] (duration: 07m 19s)
* 16:29 jforrester@deploy2002: jforrester: Continuing with sync
* 16:28 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255779{{!}}Revert "[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:26 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255779{{!}}Revert "[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki"]]
* 16:25 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:24 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255773{{!}}[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki (T420531)]] (duration: 06m 06s)
* 16:23 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:aqs-codfw
* 16:21 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 16:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 16:20 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:20 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy4004.wikimedia.org
* 16:20 fabfur@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=1) rolling upgrade of HAProxy on P<nowiki>{</nowiki>cp2041*<nowiki>}</nowiki> and A:cp - 3.2 test upgrade ()
* 16:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P<nowiki>{</nowiki>cp2041*<nowiki>}</nowiki> and A:cp - 3.2 test upgrade ()
* 16:20 jforrester@deploy2002: jforrester: Continuing with sync
* 16:19 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255773{{!}}[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki (T420531)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2003.codfw.wmnet
* 16:17 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255773{{!}}[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki (T420531)]]
* 16:17 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 16:17 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 16:17 jmm@cumin2002: START - Cookbook sre.netbox.restart-reboot rolling reboot on A:netbox
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host hcaptcha-proxy4003.wikimedia.org
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy4003.wikimedia.org with OS bookworm
* 16:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1142.eqiad.wmnet
* 16:14 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk2003.codfw.wmnet
* 16:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2002.codfw.wmnet
* 16:13 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
* 16:10 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk2002.codfw.wmnet
* 16:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5025.eqsin.wmnet with reason: host reimage
* 16:08 brouberol@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 16:07 brouberol@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 16:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1142.eqiad.wmnet
* 16:06 brouberol@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 16:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2001.codfw.wmnet
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5025.eqsin.wmnet with reason: host reimage
* 16:05 brouberol@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 16:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk2001.codfw.wmnet
* 15:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy4003.wikimedia.org with reason: host reimage
* 15:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy4003.wikimedia.org with reason: host reimage
* 15:36 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief2002.codfw.wmnet
* 15:35 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5025.eqsin.wmnet with OS trixie
* 15:34 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5026.eqsin.wmnet with OS trixie
* 15:34 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5025.eqsin.wmnet with OS trixie
* 15:34 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5026.eqsin.wmnet with OS trixie
* 15:32 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief2002.codfw.wmnet
* 15:32 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief1002.eqiad.wmnet
* 15:32 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy4003.wikimedia.org with OS bookworm
* 15:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4003.wikimedia.org - jmm@cumin2002"
* 15:31 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4003.wikimedia.org - jmm@cumin2002"
* 15:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy4003.wikimedia.org on all recursors
* 15:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy4003.wikimedia.org on all recursors
* 15:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4003.wikimedia.org - jmm@cumin2002"
* 15:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test2001.codfw.wmnet
* 15:29 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test1001.eqiad.wmnet
* 15:28 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief1002.eqiad.wmnet
* 15:26 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test2001.codfw.wmnet
* 15:25 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test1001.eqiad.wmnet
* 15:22 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255747{{!}}Hooks: Re-apply I52fc151ab88d79754baeff35d2c0f200ebe9fc9a]] (duration: 09m 55s)
* 15:22 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4003.wikimedia.org - jmm@cumin2002"
* 15:22 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:22 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:21 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:21 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:21 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:18 phuedx@deploy2002: phuedx: Continuing with sync
* 15:18 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:17 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:17 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 15:16 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 15:16 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 15:15 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 15:15 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 15:15 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 15:15 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 15:14 phuedx@deploy2002: phuedx: Backport for [[gerrit:1255747{{!}}Hooks: Re-apply I52fc151ab88d79754baeff35d2c0f200ebe9fc9a]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 15:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy4003.wikimedia.org
* 15:12 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1255747{{!}}Hooks: Re-apply I52fc151ab88d79754baeff35d2c0f200ebe9fc9a]]
* 15:11 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 15:10 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 15:10 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 15:09 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 15:09 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 15:09 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 15:07 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5026.eqsin.wmnet with OS trixie
* 15:06 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5025.eqsin.wmnet with OS trixie
* 15:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh4004.wikimedia.org
* 15:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host doh4004.wikimedia.org with OS bookworm
* 15:05 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk1003.eqiad.wmnet
* 15:01 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk1003.eqiad.wmnet
* 14:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk1002.eqiad.wmnet
* 14:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host matomo1003.eqiad.wmnet
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zookeeper-test1002.eqiad.wmnet
* 14:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk1002.eqiad.wmnet
* 14:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk1001.eqiad.wmnet
* 14:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host matomo1003.eqiad.wmnet
* 14:52 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host zookeeper-test1002.eqiad.wmnet
* 14:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk1001.eqiad.wmnet
* 14:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-conf1006.eqiad.wmnet
* 14:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh4004.wikimedia.org with reason: host reimage
* 14:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-conf1006.eqiad.wmnet
* 14:43 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh4004.wikimedia.org with reason: host reimage
* 14:40 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 14:38 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 14:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-conf1005.eqiad.wmnet
* 14:32 bking@cumin2002: conftool action : set/pooled=yes:weight=1; selector: name=dse-k8s-worker1010.eqiad.wmnet{{!}}dse-k8s-worker1011.eqiad.wmnet{{!}}dse-k8s-worker1012.eqiad.wmnet{{!}}dse-k8s-worker1013.eqiad.wmnet{{!}}dse-k8s-worker1015.eqiad.wmnet{{!}}dse-k8s-worker1016.eqiad.wmnet{{!}}dse-k8s-worker1017.eqiad.wmnet{{!}}dse-k8s-worker1018.eqiad.wmnet{{!}}dse-k8s-worker1019.eqiad.wmnet
* 14:29 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-conf1005.eqiad.wmnet
* 14:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-conf1004.eqiad.wmnet
* 14:25 bking@cumin2002: conftool action : set/pooled=yes:weight=10; selector: name=dse-k8s-worker1012.eqiad.wmnet{{!}}dse-k8s-worker1015.eqiad.wmnet{{!}}dse-k8s-worker1016.eqiad.wmnet{{!}}dse-k8s-worker1017.eqiad.wmnet
* 14:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-conf1004.eqiad.wmnet
* 14:21 jmm@deploy2002: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 14:20 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host doh4004.wikimedia.org with OS bookworm
* 14:20 jmm@deploy2002: helmfile [eqiad] START helmfile.d/services/proton: apply
* 14:19 jmm@deploy2002: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 14:18 jmm@deploy2002: helmfile [codfw] START helmfile.d/services/proton: apply
* 14:18 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh4004.wikimedia.org - jmm@cumin2002"
* 14:17 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh4004.wikimedia.org - jmm@cumin2002"
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4004.wikimedia.org on all recursors
* 14:17 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4004.wikimedia.org on all recursors
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4004.wikimedia.org - jmm@cumin2002"
* 14:13 jmm@deploy2002: helmfile [staging] DONE helmfile.d/services/proton: apply
* 14:12 jmm@deploy2002: helmfile [staging] START helmfile.d/services/proton: apply
* 14:11 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4004.wikimedia.org - jmm@cumin2002"
* 14:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:04 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh4004.wikimedia.org
* 14:03 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh4003.wikimedia.org
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host doh4003.wikimedia.org with OS bookworm
* 13:53 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:46 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250107{{!}}Expose new wikifunctions.v0 REST API module on Wikifunctions.org only (T419053)]] (duration: 06m 03s)
* 13:42 jforrester@deploy2002: jforrester: Continuing with sync
* 13:42 jforrester@deploy2002: jforrester: Backport for [[gerrit:1250107{{!}}Expose new wikifunctions.v0 REST API module on Wikifunctions.org only (T419053)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:40 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1250107{{!}}Expose new wikifunctions.v0 REST API module on Wikifunctions.org only (T419053)]]
* 13:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh4003.wikimedia.org with reason: host reimage
* 13:33 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh4003.wikimedia.org with reason: host reimage
* 13:22 urbanecm@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255686{{!}}CreateAccount: Add class to aide in instrumentation]], [[gerrit:1255685{{!}}createAccount: Log exposure and CTRs for account creation experiment (T419916)]] (duration: 12m 58s)
* 13:22 moritzm: upgrade rpki1001 to Routinator 0.15.1 [[phab:T420572|T420572]]
* 13:15 urbanecm@deploy2002: migr, urbanecm: Continuing with sync
* 13:13 urbanecm@deploy2002: migr, urbanecm: Backport for [[gerrit:1255686{{!}}CreateAccount: Add class to aide in instrumentation]], [[gerrit:1255685{{!}}createAccount: Log exposure and CTRs for account creation experiment (T419916)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:12 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host doh4003.wikimedia.org with OS bookworm
* 13:09 urbanecm@deploy2002: Started scap sync-world: Backport for [[gerrit:1255686{{!}}CreateAccount: Add class to aide in instrumentation]], [[gerrit:1255685{{!}}createAccount: Log exposure and CTRs for account creation experiment (T419916)]]
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh4003.wikimedia.org - jmm@cumin2002"
* 13:09 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh4003.wikimedia.org - jmm@cumin2002"
* 13:07 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 17 hosts with reason: upgrade
* 13:01 moritzm: installing rsync security updates
* 12:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1017.eqiad.wmnet
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm7001.magru.wmnet
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm7001.magru.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 12:59 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm7001.magru.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 12:57 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 12:54 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 12:53 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1017.eqiad.wmnet
* 12:53 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1016.eqiad.wmnet
* 12:52 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 12:52 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 12:51 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 12:51 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 12:50 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 12:50 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 12:50 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 12:50 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 12:49 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 12:48 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 12:48 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 12:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1016.eqiad.wmnet
* 12:47 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:46 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:46 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 12:46 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:46 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:45 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:45 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:45 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:45 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 12:43 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 12:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 12:43 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb2002.codfw.wmnet
* 12:41 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 12:41 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm7001.magru.wmnet
* 12:41 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 12:41 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 12:40 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 12:40 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 12:39 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 12:39 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 12:38 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 12:37 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 12:37 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:37 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
* 12:37 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh4003.wikimedia.org
* 12:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb2002.codfw.wmnet
* 12:29 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:27 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:25 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:25 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:24 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:23 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:22 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:22 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:10 urbanecm@deploy2002: mwscript-k8s job started: GrowthExperiments:reassignMentees --wiki=enwiki --mentor=Bilorv --performer=Bilorv --as-job # [[phab:T418194|T418194]]
* 11:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 11:59 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:58 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-codfw
* 11:57 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 11:53 moritzm: upgrade rpki2003 to Routinator 0.15.1 [[phab:T420572|T420572]]
* 11:46 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:40 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:40 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1017.eqiad.wmnet with reason: host reimage
* 11:26 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1017.eqiad.wmnet with reason: host reimage
* 11:18 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 11:18 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-codfw
* 11:11 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 11:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
* 11:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
* 11:05 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-codfw
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 10:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 10:55 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5]*<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 10:54 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host doh4003.wikimedia.org
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 10:54 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 10:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2004.codfw.wmnet
* 10:51 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 10:50 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:50 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 10:50 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 10:50 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:50 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 10:50 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 10:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2004.codfw.wmnet
* 10:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2005.codfw.wmnet
* 10:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema2004.codfw.wmnet
* 10:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2005.codfw.wmnet
* 10:43 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:43 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh4003.wikimedia.org
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4007.ulsfo.wmnet to cluster ulsfo02 and group 01
* 10:42 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4007.ulsfo.wmnet to cluster ulsfo02 and group 01
* 10:41 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host schema2004.codfw.wmnet
* 10:39 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema2003.codfw.wmnet
* 10:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host schema2003.codfw.wmnet
* 10:37 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1002.wikimedia.org
* 10:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema1004.eqiad.wmnet
* 10:36 Raine: created temporary categorylinks_icu72 tables -- [[phab:T419980|T419980]], [[phab:T419049|T419049]]
* 10:36 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 10:34 aokoth@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 10:33 aokoth@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 10:32 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host schema1004.eqiad.wmnet
* 10:32 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:31 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 10:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4007.ulsfo.wmnet
* 10:29 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5]*<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 10:28 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:28 fnegri@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1002.wikimedia.org
* 10:26 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2006.codfw.wmnet
* 10:25 btullis@cumin1003: END (PASS) - Cookbook sre.opensearch.roll-restart-reboot (exit_code=0) rolling reboot on A:datahubsearch
* 10:24 btullis@cumin1003: END (FAIL) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=99) rolling reboot on A:cephosd-eqiad
* 10:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4007.ulsfo.wmnet
* 10:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2006.codfw.wmnet
* 10:21 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1001.wikimedia.org
* 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2008.wikimedia.org
* 10:19 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:18 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2008.wikimedia.org
* 10:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2007.codfw.wmnet
* 10:13 fnegri@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1001.wikimedia.org
* 10:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2007.codfw.wmnet
* 10:09 btullis@cumin1003: START - Cookbook sre.opensearch.roll-restart-reboot rolling reboot on A:datahubsearch
* 10:04 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:03 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4007.ulsfo.wmnet with OS bookworm
* 09:58 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 17 hosts with reason: upgrade
* 09:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema1003.eqiad.wmnet
* 09:53 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host schema1003.eqiad.wmnet
* 09:46 jnuche@deploy2002: Finished deploy [releng/jenkins-deploy@863e5c2] (releasing): [[phab:T420477|T420477]] (duration: 01m 07s)
* 09:45 jnuche@deploy2002: Started deploy [releng/jenkins-deploy@863e5c2] (releasing): [[phab:T420477|T420477]]
* 09:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 09:43 jnuche@deploy2002: Finished deploy [releng/jenkins-deploy@863e5c2] (releasing): [[phab:T420477|T420477]] (duration: 00m 59s)
* 09:42 jnuche@deploy2002: Started deploy [releng/jenkins-deploy@863e5c2] (releasing): [[phab:T420477|T420477]]
* 09:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4007.ulsfo.wmnet with reason: host reimage
* 09:35 moritzm: installing libnginx-mod-http-lua security updates
* 09:35 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4007.ulsfo.wmnet with reason: host reimage
* 09:29 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
* 09:26 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
* 09:26 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
* 09:24 klausman@cumin2002: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:ml-serve-worker-codfw
* 09:21 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:21 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:19 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:19 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:13 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4007.ulsfo.wmnet with OS bookworm
* 09:11 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
* 09:01 moritzm: remove ganeti4007 from classic Ganeti cluster in ulsfo [[phab:T418993|T418993]]
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of doh4001.wikimedia.org to plain
* 08:54 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of doh4001.wikimedia.org to plain
* 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of doh4002.wikimedia.org to plain
* 08:46 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of doh4002.wikimedia.org to plain
* 08:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of hcaptcha-proxy4001.wikimedia.org to plain
* 08:45 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of hcaptcha-proxy4001.wikimedia.org to plain
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of hcaptcha-proxy4002.wikimedia.org to plain
* 08:44 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of hcaptcha-proxy4002.wikimedia.org to plain
* 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of install4003.wikimedia.org to plain
* 08:42 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of install4003.wikimedia.org to plain
* 08:40 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:38 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host doh4003.wikimedia.org
* 08:38 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 08:38 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 08:38 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:38 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 08:38 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 08:35 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 08:34 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 08:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 08:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 08:31 moritzm: installing python-apt security updates
* 08:29 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:29 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh4003.wikimedia.org
* 08:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 08:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 08:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 08:14 moritzm: installing imagemagick security updates on Bullseye
* 08:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1048.eqiad.wmnet
* 08:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1048.eqiad.wmnet
* 08:12 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.20 refs [[phab:T413811|T413811]]
* 08:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1048.eqiad.wmnet
* 07:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1048.eqiad.wmnet
* 07:17 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 07:17 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 07:16 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 07:16 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 07:14 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 07:14 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 04:53 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 00:06 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon2003.codfw.wmnet
* 00:02 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon2003.codfw.wmnet
* 00:01 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon1003.eqiad.wmnet
== 2026-03-18 ==
* 23:58 mutante: releases2003 - kill 782 (stunnel4) - systemctl start stunnel4 - fix [[phab:T420246|T420246]] [[phab:T420388|T420388]] [[phab:T420411|T420411]]
* 23:57 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon1003.eqiad.wmnet
* 23:49 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:cassandra-dev
* 23:23 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:cassandra-dev
* 23:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5017.*
* 23:02 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5020.*
* 23:01 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5028.*
* 22:40 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5028.eqsin.wmnet with OS trixie
* 22:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5020.eqsin.wmnet with OS trixie
* 22:08 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 22:04 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 21:51 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-eqiad
* 21:49 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on A:dnsbox
* 21:49 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7002.wikimedia.org
* 21:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 21:41 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5027.*
* 21:40 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 21:31 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 21:30 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5028.eqsin.wmnet with OS trixie
* 21:30 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7002.wikimedia.org
* 21:29 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5027.eqsin.wmnet with OS trixie
* 21:27 jforrester@deploy2002: mwscript-k8s job started: extensions/WikimediaMaintenance/maintenance/addWiki.php --wiki=abstractwiki # [[phab:T411723|T411723]] addWiki.php run
* 21:26 jforrester@deploy2002: mwscript-k8s job started: extensions/WikimediaMaintenance/maintenance/addWiki.php --wiki=abstractwiki # [[phab:T411723|T411723]] addWiki.php run
* 21:24 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255034{{!}}Revert "OrchestratorRequest: Switch evaluations to v2 endpoint" (T418887)]], [[gerrit:1247650{{!}}Create Abstract Wikipedia (T411725 T411726)]] (duration: 06m 44s)
* 21:20 jforrester@deploy2002: jforrester: Continuing with sync
* 21:20 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255034{{!}}Revert "OrchestratorRequest: Switch evaluations to v2 endpoint" (T418887)]], [[gerrit:1247650{{!}}Create Abstract Wikipedia (T411725 T411726)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:17 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255034{{!}}Revert "OrchestratorRequest: Switch evaluations to v2 endpoint" (T418887)]], [[gerrit:1247650{{!}}Create Abstract Wikipedia (T411725 T411726)]]
* 21:16 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5017.eqsin.wmnet with OS trixie
* 21:15 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7001.wikimedia.org
* 21:12 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5020.eqsin.wmnet with OS trixie
* 21:08 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255013{{!}}Guard for JS null deref on empty Parsoid sections (T419721)]], [[gerrit:1254889{{!}}Reapply "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]] (duration: 11m 20s)
* 21:07 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 21:07 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5028.eqsin.wmnet with OS trixie
* 21:04 jdlrobson@deploy2002: jdlrobson, harroyo-wmf: Continuing with sync
* 20:59 jdlrobson@deploy2002: jdlrobson, harroyo-wmf: Backport for [[gerrit:1255013{{!}}Guard for JS null deref on empty Parsoid sections (T419721)]], [[gerrit:1254889{{!}}Reapply "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:59 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-eqiad
* 20:58 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7001.wikimedia.org
* 20:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5027.eqsin.wmnet with reason: host reimage
* 20:57 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1255013{{!}}Guard for JS null deref on empty Parsoid sections (T419721)]], [[gerrit:1254889{{!}}Reapply "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]]
* 20:52 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-codfw
* 20:51 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5027.eqsin.wmnet with reason: host reimage
* 20:51 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in1001.wikimedia.org with reason: [[phab:T419960|T419960]]
* 20:50 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5020.eqsin.wmnet with OS trixie
* 20:50 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in2001.wikimedia.org with reason: [[phab:T419960|T419960]]
* 20:49 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 20:48 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5028.eqsin.wmnet with OS trixie
* 20:44 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5017.eqsin.wmnet with reason: host reimage
* 20:43 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns6002.wikimedia.org
* 20:42 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out1001.wikimedia.org with reason: [[phab:T419960|T419960]]
* 20:42 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1033.eqiad.wmnet with OS trixie
* 20:42 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 20:42 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 20:38 cscott@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254956{{!}}Limit legacy postprocessing cache to pages where DT does apply (T376183)]] (duration: 13m 54s)
* 20:37 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5017.eqsin.wmnet with reason: host reimage
* 20:35 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out2001.wikimedia.org with reason: [[phab:T419960|T419960]]
* 20:34 cscott@deploy2002: cscott: Continuing with sync
* 20:33 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns6002.wikimedia.org
* 20:28 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_codfw and not P<nowiki>{</nowiki>cp2042.codfw.wmnet<nowiki>}</nowiki> and A:cp
* 20:28 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2058.codfw.wmnet
* 20:28 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_codfw and not P<nowiki>{</nowiki>cp2041.codfw.wmnet<nowiki>}</nowiki> and A:cp
* 20:28 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2057.codfw.wmnet
* 20:26 cscott@deploy2002: cscott: Backport for [[gerrit:1254956{{!}}Limit legacy postprocessing cache to pages where DT does apply (T376183)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1033.eqiad.wmnet with reason: host reimage
* 20:24 cscott@deploy2002: Started scap sync-world: Backport for [[gerrit:1254956{{!}}Limit legacy postprocessing cache to pages where DT does apply (T376183)]]
* 20:24 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5027.eqsin.wmnet with OS trixie
* 20:22 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5029.*
* 20:21 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5029.eqsin.wmnet with OS trixie
* 20:20 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1033.eqiad.wmnet with reason: host reimage
* 20:18 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns6001.wikimedia.org
* 20:14 kemayo@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254965{{!}}Editcheck: fix tagging not happening for non-default checks]] (duration: 06m 28s)
* 20:10 kemayo@deploy2002: kemayo: Continuing with sync
* 20:10 kemayo@deploy2002: kemayo: Backport for [[gerrit:1254965{{!}}Editcheck: fix tagging not happening for non-default checks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1033.eqiad.wmnet with OS trixie
* 20:08 kemayo@deploy2002: Started scap sync-world: Backport for [[gerrit:1254965{{!}}Editcheck: fix tagging not happening for non-default checks]]
* 20:06 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 20:05 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns6001.wikimedia.org
* 20:05 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5017.eqsin.wmnet with OS trixie
* 20:05 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.*
* 20:05 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-codfw
* 19:51 Reedy: running `foreachwikiindblist fishbowl.dblist extensions/OATHAuth/maintenance/UpdateSecretsToEncryptedFormat.php` [[phab:T404363|T404363]]
* 19:51 Reedy: running `foreachwikiindblist private.dblist extensions/OATHAuth/maintenance/UpdateSecretsToEncryptedFormat.php` [[phab:T404363|T404363]]
* 19:50 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns5004.wikimedia.org
* 19:50 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2056.codfw.wmnet
* 19:50 Reedy: running `mwscript extensions/OATHAuth/maintenance/UpdateSecretsToEncryptedFormat.php --wiki=metawiki` [[phab:T404363|T404363]]
* 19:49 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5029.eqsin.wmnet with reason: host reimage
* 19:49 reedy@deploy2002: Synchronized private/PrivateSettings.php: Set $wgOATHSecretKey [[phab:T404363|T404363]] (duration: 05m 51s)
* 19:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2055.codfw.wmnet
* 19:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5030.eqsin.wmnet with OS trixie
* 19:42 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5029.eqsin.wmnet with reason: host reimage
* 19:39 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5017.eqsin.wmnet with OS trixie
* 19:35 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 19:35 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 19:33 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns5004.wikimedia.org
* 19:30 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host install4004.wikimedia.org with OS bookworm
* 19:29 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5020.eqsin.wmnet with OS trixie
* 19:28 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5020.eqsin.wmnet [reason: trixie reimaging]
* 19:28 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5018.eqsin.wmnet [reason: trixie reimaging]
* 19:27 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 19:27 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 19:26 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5018.eqsin.wmnet with OS trixie
* 19:23 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:23 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:18 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns5003.wikimedia.org
* 19:17 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 19:13 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 19:13 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on install4004.wikimedia.org with reason: host reimage
* 19:11 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2054.codfw.wmnet
* 19:08 brett@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts cp5029.eqsin.wmnet
* 19:08 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2053.codfw.wmnet
* 19:08 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5029.eqsin.wmnet with OS trixie
* 19:08 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on install4004.wikimedia.org with reason: host reimage
* 19:02 brett@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp5029.eqsin.wmnet
* 19:01 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns5003.wikimedia.org
* 18:56 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5031.*
* 18:56 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5031.eqsin.wmnet with OS trixie
* 18:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 18:47 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 18:46 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5030.eqsin.wmnet with OS trixie
* 18:46 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns4004.wikimedia.org
* 18:45 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host install4004.wikimedia.org with OS bookworm
* 18:45 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
* 18:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5032.eqsin.wmnet with OS trixie
* 18:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2052.codfw.wmnet
* 18:29 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2051.codfw.wmnet
* 18:27 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns4004.wikimedia.org
* 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5031.eqsin.wmnet with reason: host reimage
* 18:18 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5017.eqsin.wmnet with OS trixie
* 18:17 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5031.eqsin.wmnet with reason: host reimage
* 18:17 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5017.eqsin.wmnet [reason: trixie reimaging]
* 18:17 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5018.eqsin.wmnet with OS trixie
* 18:16 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5018.eqsin.wmnet [reason: trixie reimaging]
* 18:16 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5017.eqsin.wmnet [reason: trixie reimaging]
* 18:16 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5017.eqsin.wmnet [reason: trixie reimaging]
* 18:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3077.esams.wmnet [reason: trixie reimaging]
* 18:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3078.esams.wmnet [reason: trixie reimaging]
* 18:12 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 18:12 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns4003.wikimedia.org
* 18:09 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1258: Ready
* 18:07 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 18:01 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 17:59 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3078.esams.wmnet with OS trixie
* 17:56 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3077.esams.wmnet with OS trixie
* 17:55 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns4003.wikimedia.org
* 17:54 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2050.codfw.wmnet
* 17:51 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2049.codfw.wmnet
* 17:43 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:42 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:40 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns3004.wikimedia.org
* 17:39 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:38 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backupmon1001.eqiad.wmnet with reason: upgrade
* 17:35 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1347.eqiad.wmnet with OS trixie
* 17:33 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3078.esams.wmnet with reason: host reimage
* 17:32 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 17:32 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5031.eqsin.wmnet with OS trixie
* 17:32 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5032.eqsin.wmnet with OS trixie
* 17:32 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5031.eqsin.wmnet with OS trixie
* 17:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 17:30 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 17:30 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3077.esams.wmnet with reason: host reimage
* 17:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:29 claime: rearmed keyholder on deploy1003
* 17:28 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:27 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3078.esams.wmnet with reason: host reimage
* 17:26 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3077.esams.wmnet with reason: host reimage
* 17:25 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:25 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1258: Ready
* 17:23 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns3004.wikimedia.org
* 17:21 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 17:21 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 17:21 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:20 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:20 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-esams and A:ncredir
* 17:19 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1347.eqiad.wmnet with reason: host reimage
* 17:18 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-drmrs and A:ncredir
* 17:16 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-eqiad and A:ncredir
* 17:15 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-ulsfo and A:ncredir
* 17:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2048.codfw.wmnet
* 17:15 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 17:14 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1347.eqiad.wmnet with reason: host reimage
* 17:14 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5031.eqsin.wmnet with OS trixie
* 17:12 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 17:12 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2047.codfw.wmnet
* 17:11 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 17:09 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 17:09 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp3078.*
* 17:08 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 17:08 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 17:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3079.*
* 17:08 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns3003.wikimedia.org
* 17:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3078.*
* 17:07 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-eqiad and A:ncredir
* 17:07 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
* 17:07 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-esams and A:ncredir
* 17:06 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
* 17:06 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir2002.*
* 17:05 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-drmrs and A:ncredir
* 17:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir2002.codfw.wmnet
* 17:05 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-eqsin and A:ncredir
* 17:05 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-ulsfo and A:ncredir
* 17:04 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-magru and A:ncredir
* 17:03 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3078.esams.wmnet with OS trixie
* 17:02 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1347
* 17:02 jayme@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1347
* 17:02 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3078.esams.wmnet [reason: trixie reimaging]
* 17:01 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3076.esams.wmnet [reason: trixie reimaging]
* 17:01 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3077.esams.wmnet with OS trixie
* 17:01 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3077.esams.wmnet [reason: trixie reimaging]
* 16:59 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncredir2002.codfw.wmnet
* 16:58 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir2002.*
* 16:56 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 8 hosts with reason: upgrade
* 16:55 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir2001.*
* 16:55 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ncredir2001.codfw.wmnet
* 16:55 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for ncredir2001.codfw.wmnet
* 16:55 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3076.esams.wmnet with OS trixie
* 16:53 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2014.codfw.wmnet
* 16:52 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-eqsin and A:ncredir
* 16:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2008.codfw.wmnet with reason: kernel update
* 16:51 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-magru and A:ncredir
* 16:51 klausman@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on ml-serve1013.eqiad.wmnet with reason: Reboot for security update
* 16:50 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2013.codfw.wmnet
* 16:49 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir2001.*
* 16:49 brett@cumin2002: END (ERROR) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=97) rolling reboot on A:ncredir and A:ncredir
* 16:48 jayme@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1347
* 16:48 jayme@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1347.eqiad.wmnet 199.48.64.10.in-addr.arpa 9.9.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 16:48 jayme@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1347.eqiad.wmnet 199.48.64.10.in-addr.arpa 9.9.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 16:48 jayme@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:47 jayme@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1347 - jayme@cumin1003"
* 16:47 jayme@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1347 - jayme@cumin1003"
* 16:47 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns3003.wikimedia.org
* 16:47 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1012.eqiad.wmnet
* 16:47 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir and A:ncredir
* 16:47 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2012.codfw.wmnet
* 16:47 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2014.codfw.wmnet
* 16:46 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3075.esams.wmnet [reason: trixie reimaging]
* 16:46 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2003.codfw.wmnet
* 16:45 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet
* 16:44 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2013.codfw.wmnet
* 16:44 jayme@cumin1003: START - Cookbook sre.dns.netbox
* 16:43 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2009.codfw.wmnet
* 16:43 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1347
* 16:43 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99)
* 16:43 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1347.eqiad.wmnet with OS trixie
* 16:43 brett@cumin2002: START - Cookbook sre.hosts.reboot-cluster
* 16:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2007.codfw.wmnet with reason: kernel update
* 16:40 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2012.codfw.wmnet
* 16:39 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3079.esams.wmnet with OS trixie
* 16:39 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2008.codfw.wmnet
* 16:38 moritzm: installing PHP 8.2 security updates
* 16:37 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2009.codfw.wmnet
* 16:36 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2046.codfw.wmnet
* 16:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3078.esams.wmnet with OS trixie
* 16:34 moritzm: installing alsa-lib security updates
* 16:33 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2045.codfw.wmnet
* 16:32 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2008.codfw.wmnet
* 16:32 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns2006.wikimedia.org
* 16:29 moritzm: failover Ganeti master in eqiad to ganeti1046
* 16:29 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3076.esams.wmnet with reason: host reimage
* 16:29 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2003.codfw.wmnet
* 16:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1054.eqiad.wmnet
* 16:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1054.eqiad.wmnet
* 16:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2005.codfw.wmnet with reason: kernel update
* 16:24 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3076.esams.wmnet with reason: host reimage
* 16:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1054.eqiad.wmnet
* 16:22 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1012.eqiad.wmnet
* 16:20 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1013.eqiad.wmnet
* 16:19 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1011.eqiad.wmnet
* 16:18 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1054.eqiad.wmnet
* 16:18 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 16:16 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host install4004.wikimedia.org with OS bookworm
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1053.eqiad.wmnet
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1053.eqiad.wmnet
* 16:14 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns2006.wikimedia.org
* 16:14 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1013.eqiad.wmnet
* 16:14 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1009.eqiad.wmnet
* 16:13 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3079.esams.wmnet with reason: host reimage
* 16:13 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1011.eqiad.wmnet
* 16:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1029.eqiad.wmnet with reason: kernel update
* 16:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1016.eqiad.wmnet with OS bookworm
* 16:12 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 16:11 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 16:11 moritzm: powercycling ganeti1053 (stuck on reboot)
* 16:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3078.esams.wmnet with reason: host reimage
* 16:09 klausman@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:ml-serve-worker-eqiad
* 16:09 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 16:08 klausman@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:ml-serve-worker-eqiad
* 16:07 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1009.eqiad.wmnet
* 16:07 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1003.eqiad.wmnet
* 16:06 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3079.esams.wmnet with reason: host reimage
* 16:06 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 16:06 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3078.esams.wmnet with reason: host reimage
* 16:04 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 16:04 klausman@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:ml-serve-worker-eqiad
* 16:04 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 16:02 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 16:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1028.eqiad.wmnet with reason: kernel update
* 16:00 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1003.eqiad.wmnet
* 16:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1015.eqiad.wmnet
* 16:00 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3075.esams.wmnet with OS trixie
* 16:00 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3076.esams.wmnet with OS trixie
* 15:59 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns2005.wikimedia.org
* 15:58 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3076.esams.wmnet [reason: trixie reimaging]
* 15:58 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1012.eqiad.wmnet
* 15:58 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2044.codfw.wmnet
* 15:58 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3078.esams.wmnet [reason: trixie reimaging]
* 15:57 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3078.esams.wmnet [reason: trixie reimaging]
* 15:57 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1010.eqiad.wmnet
* 15:57 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1008.eqiad.wmnet
* 15:57 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3074.esams.wmnet [reason: trixie reimaging]
* 15:56 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 15:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1053.eqiad.wmnet
* 15:55 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2043.codfw.wmnet
* 15:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1017.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1023.eqiad.wmnet with reason: kernel update
* 15:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1015.eqiad.wmnet
* 15:54 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy1022.eqiad.wmnet
* 15:54 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1008.eqiad.wmnet
* 15:54 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy1022.eqiad.wmnet
* 15:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1053.eqiad.wmnet
* 15:52 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1052.eqiad.wmnet
* 15:52 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1010.eqiad.wmnet
* 15:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1052.eqiad.wmnet
* 15:51 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1012.eqiad.wmnet
* 15:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3074.esams.wmnet with OS trixie
* 15:49 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_codfw and not P<nowiki>{</nowiki>cp2042.codfw.wmnet<nowiki>}</nowiki> and A:cp
* 15:48 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1014.eqiad.wmnet
* 15:48 klausman@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on A:ml-serve-worker-eqiad
* 15:47 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy1003.eqiad.wmnet
* 15:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1052.eqiad.wmnet
* 15:46 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_codfw and not P<nowiki>{</nowiki>cp2041.codfw.wmnet<nowiki>}</nowiki> and A:cp
* 15:45 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1017.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:45 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns2005.wikimedia.org
* 15:42 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1014.eqiad.wmnet
* 15:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1052.eqiad.wmnet
* 15:41 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3079.esams.wmnet with OS trixie
* 15:41 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3078.esams.wmnet with OS trixie
* 15:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
* 15:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1051.eqiad.wmnet
* 15:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1022.eqiad.wmnet with reason: kernel update
* 15:38 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Update for dse-k8s-worker1016 - btullis@cumin1003"
* 15:37 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Update for dse-k8s-worker1016 - btullis@cumin1003"
* 15:37 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy1003.eqiad.wmnet
* 15:37 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1027.eqiad.wmnet
* 15:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1016.eqiad.wmnet with reason: host reimage
* 15:35 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-eqiad
* 15:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1051.eqiad.wmnet
* 15:34 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3075.esams.wmnet with reason: host reimage
* 15:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1372.eqiad.wmnet
* 15:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1371.eqiad.wmnet
* 15:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1370.eqiad.wmnet
* 15:31 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1027.eqiad.wmnet
* 15:30 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns2004.wikimedia.org
* 15:29 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1016.eqiad.wmnet with reason: host reimage
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1369.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1368.eqiad.wmnet
* 15:28 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1372.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1367.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1366.eqiad.wmnet
* 15:28 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1371.eqiad.wmnet
* 15:28 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1370.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1365.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1364.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1363.eqiad.wmnet
* 15:27 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1362.eqiad.wmnet
* 15:27 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1361.eqiad.wmnet
* 15:27 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1017
* 15:27 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1360.eqiad.wmnet
* 15:26 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1017
* 15:25 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3074.esams.wmnet with reason: host reimage
* 15:25 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install4004.wikimedia.org with OS bookworm
* 15:25 sukhe@dns1004: END - running authdns-update
* 15:24 sukhe@dns1004: START - running authdns-update
* 15:24 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install4004.wikimedia.org
* 15:24 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host install4004.wikimedia.org with OS bookworm
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1369.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1368.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1367.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1366.eqiad.wmnet
* 15:23 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1365.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1364.eqiad.wmnet
* 15:22 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1363.eqiad.wmnet
* 15:22 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1362.eqiad.wmnet
* 15:22 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1361.eqiad.wmnet
* 15:22 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1360.eqiad.wmnet
* 15:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1349.eqiad.wmnet
* 15:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1050.eqiad.wmnet
* 15:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1050.eqiad.wmnet
* 15:18 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3075.esams.wmnet with reason: host reimage
* 15:18 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3074.esams.wmnet with reason: host reimage
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1348.eqiad.wmnet
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1346.eqiad.wmnet
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1344.eqiad.wmnet
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1345.eqiad.wmnet
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1343.eqiad.wmnet
* 15:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1342.eqiad.wmnet
* 15:16 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns2004.wikimedia.org
* 15:15 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1349.eqiad.wmnet
* 15:15 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1341.eqiad.wmnet
* 15:15 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1340.eqiad.wmnet
* 15:15 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1339.eqiad.wmnet
* 15:15 moritzm: imported jenkins 2.541.3 for bullseye/bookworm/trixie
* 15:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1338.eqiad.wmnet
* 15:14 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1016.eqiad.wmnet with OS bookworm
* 15:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1050.eqiad.wmnet
* 15:12 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1348.eqiad.wmnet
* 15:12 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1346.eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1336.eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1337.eqiad.wmnet
* 15:12 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1345.eqiad.wmnet
* 15:12 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1344.eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1334.eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1335.eqiad.wmnet
* 15:11 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1343.eqiad.wmnet
* 15:11 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1342.eqiad.wmnet
* 15:11 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1332.eqiad.wmnet
* 15:11 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1333.eqiad.wmnet
* 15:11 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Post reimage - btullis@cumin1003"
* 15:11 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Post reimage - btullis@cumin1003"
* 15:10 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1341.eqiad.wmnet
* 15:10 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1340.eqiad.wmnet
* 15:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1331.eqiad.wmnet
* 15:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1330.eqiad.wmnet
* 15:10 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1339.eqiad.wmnet
* 15:09 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1329.eqiad.wmnet
* 15:09 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1338.eqiad.wmnet
* 15:09 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1328.eqiad.wmnet
* 15:09 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1050.eqiad.wmnet
* 15:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1049.eqiad.wmnet
* 15:07 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1337.eqiad.wmnet
* 15:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1049.eqiad.wmnet
* 15:07 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1336.eqiad.wmnet
* 15:07 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1335.eqiad.wmnet
* 15:06 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1334.eqiad.wmnet
* 15:06 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1333.eqiad.wmnet
* 15:06 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1332.eqiad.wmnet
* 15:05 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1331.eqiad.wmnet
* 15:05 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1330.eqiad.wmnet
* 15:04 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1329.eqiad.wmnet
* 15:04 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1328.eqiad.wmnet
* 15:03 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-eqiad
* 15:02 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1033.eqiad.wmnet with OS trixie
* 15:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1049.eqiad.wmnet
* 15:01 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns1006.wikimedia.org
* 14:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum4002.ulsfo.wmnet
* 14:58 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:58 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:57 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
* 14:54 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3075.esams.wmnet with OS trixie
* 14:54 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3075.esams.wmnet [reason: trixie reimaging]
* 14:54 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3074.esams.wmnet with OS trixie
* 14:53 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3074.esams.wmnet [reason: trixie reimaging]
* 14:52 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1047.eqiad.wmnet
* 14:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1047.eqiad.wmnet
* 14:50 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:49 slyngshede@dns1004: END - running authdns-update
* 14:48 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:48 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:48 slyngshede@dns1004: START - running authdns-update
* 14:47 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns1006.wikimedia.org
* 14:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1047.eqiad.wmnet
* 14:46 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum4002.ulsfo.wmnet
* 14:45 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-eqiad
* 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum4001.ulsfo.wmnet
* 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum4001.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:44 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum4001.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1047.eqiad.wmnet
* 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1046.eqiad.wmnet
* 14:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1046.eqiad.wmnet
* 14:40 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:38 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1046.eqiad.wmnet
* 14:36 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum4001.ulsfo.wmnet
* 14:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install4004.wikimedia.org with OS bookworm
* 14:33 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1046.eqiad.wmnet
* 14:32 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns1005.wikimedia.org
* 14:32 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "inline pattern and pattern equivalence - oblivian@cumin1003"
* 14:32 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: inline pattern and pattern equivalence - oblivian@cumin1003
* 14:31 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: inline pattern and pattern equivalence - oblivian@cumin1003
* 14:31 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "inline pattern and pattern equivalence - oblivian@cumin1003"
* 14:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1045.eqiad.wmnet
* 14:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1045.eqiad.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install4004.wikimedia.org - jmm@cumin2002"
* 14:25 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install4004.wikimedia.org - jmm@cumin2002"
* 14:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1045.eqiad.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install4004.wikimedia.org on all recursors
* 14:24 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install4004.wikimedia.org on all recursors
* 14:21 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) install4004.wikimedia.org on all recursors
* 14:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install4004.wikimedia.org on all recursors
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install4004.wikimedia.org - jmm@cumin2002"
* 14:20 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install4004.wikimedia.org - jmm@cumin2002"
* 14:19 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns1005.wikimedia.org
* 14:17 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254911{{!}}Restore quotation-marks in ext.wikilambda.app messages (T420456)]] (duration: 06m 32s)
* 14:17 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 14:17 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS bookworm
* 14:17 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:16 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:16 apine@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install4004.wikimedia.org
* 14:15 apine@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:15 apine@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:14 apine@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:14 apine@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:14 jforrester@deploy2002: jforrester: Continuing with sync
* 14:13 apine@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:13 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:13 jforrester@deploy2002: jforrester: Backport for [[gerrit:1254911{{!}}Restore quotation-marks in ext.wikilambda.app messages (T420456)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:13 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:11 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1254911{{!}}Restore quotation-marks in ext.wikilambda.app messages (T420456)]]
* 14:08 apine@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 apine@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 apine@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:06 apine@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:05 XioNoX: set graceful-shutdown on EdgeUno transit sessions
* 14:05 apine@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:04 apine@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:04 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns1004.wikimedia.org
* 14:02 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1045.eqiad.wmnet
* 14:01 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 14:01 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:59 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 13:57 Msz2001: UTC afternoon backport+config window done
* 13:56 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254876{{!}}Tweak configuration of external link aggregate usage analysis (T419837)]] (duration: 06m 41s)
* 13:55 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1033.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:55 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1033.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:53 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 13:52 mszwarc@deploy2002: mszwarc: Continuing with sync
* 13:51 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1254876{{!}}Tweak configuration of external link aggregate usage analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:50 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns1004.wikimedia.org
* 13:50 sukhe@cumin1003: START - Cookbook sre.dns.roll-reboot rolling reboot on A:dnsbox
* 13:49 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1254876{{!}}Tweak configuration of external link aggregate usage analysis (T419837)]]
* 13:49 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254916{{!}}Normalize external domain names in click analysis (T419837)]], [[gerrit:1254917{{!}}Normalize external domain names in click analysis (T419837)]] (duration: 07m 23s)
* 13:46 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet
* 13:45 mszwarc@deploy2002: mszwarc: Continuing with sync
* 13:43 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1254916{{!}}Normalize external domain names in click analysis (T419837)]], [[gerrit:1254917{{!}}Normalize external domain names in click analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2096.codfw.wmnet
* 13:41 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1033.eqiad.wmnet with OS trixie
* 13:41 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS bookworm
* 13:41 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1254916{{!}}Normalize external domain names in click analysis (T419837)]], [[gerrit:1254917{{!}}Normalize external domain names in click analysis (T419837)]]
* 13:40 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248095{{!}}filebackend: Remove outdated comment]], [[gerrit:1254216{{!}}GrowthExperiments: increase edit and thanks query limit II (T341599)]] (duration: 08m 47s)
* 13:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1044.eqiad.wmnet
* 13:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1044.eqiad.wmnet
* 13:39 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet
* 13:39 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1096.eqiad.wmnet
* 13:37 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2096.codfw.wmnet
* 13:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:36 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2095.codfw.wmnet
* 13:36 sgimeno@deploy2002: matmarex, sgimeno: Continuing with sync
* 13:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1044.eqiad.wmnet
* 13:33 sgimeno@deploy2002: matmarex, sgimeno: Backport for [[gerrit:1248095{{!}}filebackend: Remove outdated comment]], [[gerrit:1254216{{!}}GrowthExperiments: increase edit and thanks query limit II (T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:31 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1096.eqiad.wmnet
* 13:31 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1095.eqiad.wmnet
* 13:31 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1248095{{!}}filebackend: Remove outdated comment]], [[gerrit:1254216{{!}}GrowthExperiments: increase edit and thanks query limit II (T341599)]]
* 13:30 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2095.codfw.wmnet
* 13:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2094.codfw.wmnet
* {{safesubst:SAL entry|1=13:28 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254894{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254895{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254891{{!}}Revert "SpecialPreferences: Use Language Select Widget in language field" (T419895)]], [[gerrit:1254890{{!}}Revert "SpecialPreferences: Use Language Select Widget in lan}}
* 13:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 13:28 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 13:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1026.eqiad.wmnet
* 13:27 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1044.eqiad.wmnet
* 13:26 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 13:25 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1095.eqiad.wmnet
* 13:25 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1094.eqiad.wmnet
* 13:24 sgimeno@deploy2002: somerandomdeveloper, sgimeno: Continuing with sync
* {{safesubst:SAL entry|1=13:24 sgimeno@deploy2002: somerandomdeveloper, sgimeno: Backport for [[gerrit:1254894{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254895{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254891{{!}}Revert "SpecialPreferences: Use Language Select Widget in language field" (T419895)]], [[gerrit:1254890{{!}}Revert "SpecialPreferences: Use Language Select Widget in}}
* 13:23 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:23 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2094.codfw.wmnet
* 13:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2093.codfw.wmnet
* {{safesubst:SAL entry|1=13:22 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1254894{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254895{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254891{{!}}Revert "SpecialPreferences: Use Language Select Widget in language field" (T419895)]], [[gerrit:1254890{{!}}Revert "SpecialPreferences: Use Language Select Widget in lang}}
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1043.eqiad.wmnet
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1043.eqiad.wmnet
* 13:21 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1026.eqiad.wmnet
* 13:20 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:16 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2093.codfw.wmnet
* 13:16 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2092.codfw.wmnet
* 13:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1043.eqiad.wmnet
* 13:16 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1094.eqiad.wmnet
* 13:16 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1093.eqiad.wmnet
* 13:15 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:15 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 13:15 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 13:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1043.eqiad.wmnet
* 13:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1042.eqiad.wmnet
* 13:10 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:10 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 13:09 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1093.eqiad.wmnet
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1042.eqiad.wmnet
* 13:09 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2092.codfw.wmnet
* 13:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1027.eqiad.wmnet with reason: host reimage
* 13:08 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1016
* 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 13:06 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1016
* 13:06 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 13:04 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1027.eqiad.wmnet with reason: host reimage
* 13:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1042.eqiad.wmnet
* 13:02 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 13:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1012.eqiad.wmnet
* 12:58 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1020.eqiad.wmnet with OS bookworm
* 12:58 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 12:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1042.eqiad.wmnet
* 12:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1015.eqiad.wmnet
* 12:55 ayounsi@dns1004: END - running authdns-update
* 12:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1012.eqiad.wmnet
* 12:54 ayounsi@dns1004: START - running authdns-update
* 12:53 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1012.eqiad.wmnet
* 12:53 ayounsi@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 12:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1015.eqiad.wmnet
* 12:50 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 12:50 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1041.eqiad.wmnet
* 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1041.eqiad.wmnet
* 12:47 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-codfw
* 12:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1041.eqiad.wmnet
* 12:42 btullis@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-jumbo-eqiad
* 12:38 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:37 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:37 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:36 ayounsi@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 12:35 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1372].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 12:33 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:32 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:31 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:30 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1020.eqiad.wmnet with reason: host reimage
* 12:27 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1041.eqiad.wmnet
* 12:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1040.eqiad.wmnet
* 12:25 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1020.eqiad.wmnet with reason: host reimage
* 12:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1040.eqiad.wmnet
* 12:25 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 12:25 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1026.eqiad.wmnet with OS bookworm
* 12:25 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:24 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Update for dse-k8s-worker1015 - btullis@cumin1003"
* 12:24 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Update for dse-k8s-worker1015 - btullis@cumin1003"
* 12:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 12:22 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:21 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1040.eqiad.wmnet
* 12:19 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:19 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host serpens.wikimedia.org
* 12:13 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254851{{!}}Enable autodemotion for 2FA-less CN admins and WMF T&S (T418580)]] (duration: 06m 21s)
* 12:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host serpens.wikimedia.org
* 12:10 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1040.eqiad.wmnet
* 12:10 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-eqiad
* 12:09 mszwarc@deploy2002: mszwarc: Continuing with sync
* 12:09 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1254851{{!}}Enable autodemotion for 2FA-less CN admins and WMF T&S (T418580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:07 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1254851{{!}}Enable autodemotion for 2FA-less CN admins and WMF T&S (T418580)]]
* 12:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1039.eqiad.wmnet
* 12:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1039.eqiad.wmnet
* 12:05 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:05 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-eqiad
* 12:04 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:03 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:02 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254883{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254884{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]], [[gerrit:1254882{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254881{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]] (duration: 06m 48s)
* 12:02 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1026.eqiad.wmnet with reason: host reimage
* 12:02 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-eqiad
* 12:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1039.eqiad.wmnet
* 12:01 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1039.eqiad.wmnet
* 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet
* 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
* 11:59 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1026.eqiad.wmnet with reason: host reimage
* 11:58 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 11:58 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1012.eqiad.wmnet
* 11:57 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1254883{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254884{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]], [[gerrit:1254882{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254881{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]] synced to the testservers (see https://wikitech.wikimedia.
* 11:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 11:56 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1372].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 11:56 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-codfw
* 11:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1027.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 11:55 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1254883{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254884{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]], [[gerrit:1254882{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254881{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]]
* 11:54 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1027.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 11:54 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 11:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
* 11:50 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:50 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Updating for dse-k8s-worker1012 - btullis@cumin1003"
* 11:49 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Updating for dse-k8s-worker1012 - btullis@cumin1003"
* 11:49 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1015.eqiad.wmnet with reason: host reimage
* 11:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1037.eqiad.wmnet
* 11:48 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1037.eqiad.wmnet
* 11:48 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1307.eqiad.wmnet
* 11:48 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1307.eqiad.wmnet
* 11:47 claime: sudo homer lsw1-e5-eqiad* commit 'wikikube-worker1307 to active'
* 11:47 cgoubert@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:46 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1015.eqiad.wmnet with reason: host reimage
* 11:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 11:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1026.eqiad.wmnet with OS bookworm
* 11:44 cgoubert@cumin1003: START - Cookbook sre.dns.netbox
* 11:42 cgoubert@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1026.eqiad.wmnet with OS bookworm
* 11:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet
* 11:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
* 11:39 cgoubert@cumin1003: START - Cookbook sre.dns.netbox
* 11:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1037.eqiad.wmnet
* 11:37 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2091.codfw.wmnet
* 11:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet
* 11:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1036.eqiad.wmnet
* 11:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet
* 11:36 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1347.eqiad.wmnet
* 11:34 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1020.eqiad.wmnet with OS bookworm
* 11:31 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 11:31 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2091.codfw.wmnet
* 11:30 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1092.eqiad.wmnet
* 11:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2090.codfw.wmnet
* 11:30 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1347.eqiad.wmnet
* 11:30 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet
* 11:30 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 11:30 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 11:29 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 11:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet
* 11:29 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 11:28 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 11:28 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 11:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet
* 11:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet
* 11:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2090.codfw.wmnet
* 11:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2089.codfw.wmnet
* 11:23 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1092.eqiad.wmnet
* 11:23 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1091.eqiad.wmnet
* 11:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1035.eqiad.wmnet
* 11:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet
* 11:20 btullis@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host dse-k8s-worker1015
* 11:20 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1015
* 11:18 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1091.eqiad.wmnet
* 11:18 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1090.eqiad.wmnet
* 11:18 vgutierrez@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: no reason specified, no task ID specified]
* 11:18 vgutierrez@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: no reason specified, no task ID specified]
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping2004.codfw.wmnet
* 11:17 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
* 11:17 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2088.codfw.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet
* 11:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping2004.codfw.wmnet
* 11:13 vgutierrez@dns1004: END - running authdns-update
* 11:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1035.eqiad.wmnet
* 11:11 vgutierrez@dns1004: START - running authdns-update
* 11:11 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1090.eqiad.wmnet
* 11:11 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1089.eqiad.wmnet
* 11:10 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2088.codfw.wmnet
* 11:10 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2087.codfw.wmnet
* 11:08 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1034.eqiad.wmnet
* 11:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1034.eqiad.wmnet
* 11:07 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1026.eqiad.wmnet with OS bookworm
* 11:05 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1026.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 11:05 btullis@cumin1003: END (PASS) - Cookbook sre.hadoop.reboot-workers (exit_code=0) for Hadoop test cluster
* 11:04 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1089.eqiad.wmnet
* 11:04 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1026.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 11:04 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1088.eqiad.wmnet
* 11:03 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2087.codfw.wmnet
* 11:03 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2086.codfw.wmnet
* 11:03 vgutierrez@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1034.eqiad.wmnet
* 11:00 vgutierrez@cumin1003: START - Cookbook sre.dns.netbox
* 10:59 btullis@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-jumbo-eqiad
* 10:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1034.eqiad.wmnet
* 10:57 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1088.eqiad.wmnet
* 10:57 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1087.eqiad.wmnet
* 10:57 fabfur@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1033.eqiad.wmnet
* 10:56 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2086.codfw.wmnet
* 10:56 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2085.codfw.wmnet
* 10:56 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 10:53 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1087.eqiad.wmnet
* 10:53 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1086.eqiad.wmnet
* 10:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet
* 10:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
* 10:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1032.eqiad.wmnet
* 10:49 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2085.codfw.wmnet
* 10:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet
* 10:49 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2084.codfw.wmnet
* 10:46 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1086.eqiad.wmnet
* 10:46 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1085.eqiad.wmnet
* 10:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet
* 10:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1032.eqiad.wmnet
* 10:40 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2084.codfw.wmnet
* 10:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1031.eqiad.wmnet
* 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2083.codfw.wmnet
* 10:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1031.eqiad.wmnet
* 10:39 fabfur@cumin1003: START - Cookbook sre.dns.netbox
* 10:39 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1085.eqiad.wmnet
* 10:39 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1084.eqiad.wmnet
* 10:37 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2001.codfw.wmnet
* 10:34 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:34 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1031.eqiad.wmnet
* 10:32 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1084.eqiad.wmnet
* 10:32 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1083.eqiad.wmnet
* 10:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2083.codfw.wmnet
* 10:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1031.eqiad.wmnet
* 10:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2082.codfw.wmnet
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1030.eqiad.wmnet
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1030.eqiad.wmnet
* 10:32 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-staging2001.codfw.wmnet
* 10:31 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2002.codfw.wmnet
* 10:26 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-staging2002.codfw.wmnet
* 10:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1030.eqiad.wmnet
* 10:25 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1083.eqiad.wmnet
* 10:25 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1082.eqiad.wmnet
* 10:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2082.codfw.wmnet
* 10:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2081.codfw.wmnet
* 10:24 btullis@cumin1003: START - Cookbook sre.hadoop.reboot-workers for Hadoop test cluster
* 10:23 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2003.codfw.wmnet
* 10:23 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1030.eqiad.wmnet
* 10:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1029.eqiad.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet
* 10:19 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-staging2003.codfw.wmnet
* 10:18 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1082.eqiad.wmnet
* 10:17 vgutierrez@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: no reason specified, no task ID specified]
* 10:17 vgutierrez@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: no reason specified, no task ID specified]
* 10:17 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2081.codfw.wmnet
* 10:17 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2001.codfw.wmnet
* 10:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet
* 10:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1029.eqiad.wmnet
* 10:14 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2001.codfw.wmnet
* 10:14 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2002.codfw.wmnet
* 10:13 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1081.eqiad.wmnet
* 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1028.eqiad.wmnet
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1028.eqiad.wmnet
* 10:11 vgutierrez@dns1004: END - running authdns-update
* 10:10 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2002.codfw.wmnet
* 10:10 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2001.codfw.wmnet
* 10:09 vgutierrez@dns1004: START - running authdns-update
* 10:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2080.codfw.wmnet
* 10:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1028.eqiad.wmnet
* 10:06 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1081.eqiad.wmnet
* 10:06 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1080.eqiad.wmnet
* 10:05 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2001.codfw.wmnet
* 10:05 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2002.codfw.wmnet
* 10:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: no reason specified, [[phab:T418971|T418971]]]
* 10:04 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: no reason specified, [[phab:T418971|T418971]]]
* 10:03 slyngshede@cumin1003: END (FAIL) - Cookbook sre.dns.admin (exit_code=99) DNS admin: pool ulsfo [reason: no reason specified, no task ID specified]
* 10:03 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: no reason specified, no task ID specified]
* 10:01 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2002.codfw.wmnet
* 10:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 23 hosts
* 10:01 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1028.eqiad.wmnet
* 10:01 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1002.eqiad.wmnet
* 10:01 slyngshede@cumin1003: START - Cookbook sre.hosts.remove-downtime for 23 hosts
* 09:59 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1080.eqiad.wmnet
* 09:59 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1079.eqiad.wmnet
* 09:59 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2080.codfw.wmnet
* 09:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2079.codfw.wmnet
* 09:58 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1002.eqiad.wmnet
* 09:57 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1001.eqiad.wmnet
* 09:56 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1027.eqiad.wmnet
* 09:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet
* 09:52 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1001.eqiad.wmnet
* 09:51 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2079.codfw.wmnet
* 09:51 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2078.codfw.wmnet
* 09:51 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1079.eqiad.wmnet
* 09:51 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1003.eqiad.wmnet
* 09:51 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1078.eqiad.wmnet
* 09:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1003.eqiad.wmnet
* 09:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet
* 09:48 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1003.eqiad.wmnet
* 09:48 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1002.eqiad.wmnet
* 09:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1003.eqiad.wmnet
* 09:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1027.eqiad.wmnet
* 09:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 09:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1026.eqiad.wmnet
* 09:46 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1002.eqiad.wmnet
* 09:46 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1001.eqiad.wmnet
* 09:45 moritzm: installing postgresql-15 security updates
* 09:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart A:lvs-secondary-ulsfo and A:liberica
* 09:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) pooling P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2078.codfw.wmnet
* 09:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2077.codfw.wmnet
* 09:45 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.admin pooling P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) depooling P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:44 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.admin depooling P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2003.codfw.wmnet
* 09:44 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.upgrade restart A:lvs-secondary-ulsfo and A:liberica
* 09:44 jayme: switched wikikube staging apiservers to IPIP and maglev in eqiad and codfw - [[phab:T352956|T352956]]
* 09:43 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1078.eqiad.wmnet
* 09:43 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1077.eqiad.wmnet
* 09:42 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1001.eqiad.wmnet
* 09:40 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: wikikube-staging-master-eqiad@eqiad
* 09:40 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:40 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading A:lvs-secondary-ulsfo and A:liberica ([[phab:T418971|T418971]])
* 09:40 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading A:lvs-secondary-ulsfo and A:liberica ([[phab:T418971|T418971]])
* 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2003.codfw.wmnet
* 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1026.eqiad.wmnet
* 09:39 jayme@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:38 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2077.codfw.wmnet
* 09:37 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2076.codfw.wmnet
* 09:37 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-codfw
* 09:37 jayme@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: wikikube-staging-master-eqiad@eqiad
* 09:36 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1077.eqiad.wmnet
* 09:36 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1076.eqiad.wmnet
* 09:35 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: wikikube-staging-master-codfw@codfw
* 09:35 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:35 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 09:30 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2076.codfw.wmnet
* 09:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2075.codfw.wmnet
* 09:26 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1076.eqiad.wmnet
* 09:26 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1075.eqiad.wmnet
* 09:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 09:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1025.eqiad.wmnet
* 09:23 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2075.codfw.wmnet
* 09:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2074.codfw.wmnet
* 09:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 09:19 jayme@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:18 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1075.eqiad.wmnet
* 09:18 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1074.eqiad.wmnet
* 09:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 09:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1025.eqiad.wmnet
* 09:15 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2074.codfw.wmnet
* 09:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2073.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: wikikube-staging-master-codfw@codfw
* 09:14 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 09:13 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-codfw
* 09:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 09:12 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-eqiad
* 09:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1024.eqiad.wmnet
* 09:10 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1074.eqiad.wmnet
* 09:10 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1073.eqiad.wmnet
* 09:08 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2073.codfw.wmnet
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2072.codfw.wmnet
* 09:08 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 23 hosts with reason: Update ULSFO LVS service IPs
* 09:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1024.eqiad.wmnet
* 09:03 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1073.eqiad.wmnet
* 09:03 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1072.eqiad.wmnet
* 09:02 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2072.codfw.wmnet
* 09:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2071.codfw.wmnet
* 09:02 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: no reason specified, [[phab:T418971|T418971]]]
* 09:02 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: no reason specified, [[phab:T418971|T418971]]]
* 09:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping1004.eqiad.wmnet
* 09:00 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 08:58 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1023.eqiad.wmnet
* 08:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1023.eqiad.wmnet
* 08:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping1004.eqiad.wmnet
* 08:56 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1072.eqiad.wmnet
* 08:56 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1071.eqiad.wmnet
* 08:55 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2071.codfw.wmnet
* 08:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2070.codfw.wmnet
* 08:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1023.eqiad.wmnet
* 08:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1023.eqiad.wmnet
* 08:48 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1071.eqiad.wmnet
* 08:47 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2070.codfw.wmnet
* 08:46 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-eqiad
* 08:29 hashar: Restarting CI Jenkins for plugin upgrade # [[phab:T420347|T420347]]
* 08:22 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.20 refs [[phab:T413811|T413811]]
* 07:45 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 32934
* 07:42 btullis@cumin1003: END (PASS) - Cookbook sre.hadoop.reboot-workers (exit_code=0) for Hadoop analytics cluster
* 07:35 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 32934
* 07:22 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
* 07:16 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
* 06:54 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
* 06:38 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
* 03:22 musikanimal@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254468{{!}}CM5: add more aggressive warnings about CM5 deprecation (T373720)]] (duration: 12m 22s)
* 03:18 musikanimal@deploy2002: musikanimal: Continuing with sync
* 03:11 musikanimal@deploy2002: musikanimal: Backport for [[gerrit:1254468{{!}}CM5: add more aggressive warnings about CM5 deprecation (T373720)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 03:09 musikanimal@deploy2002: Started scap sync-world: Backport for [[gerrit:1254468{{!}}CM5: add more aggressive warnings about CM5 deprecation (T373720)]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 47s)
* 02:07 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 02:06 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 02:05 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 02:05 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 02:05 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 02:05 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 02:04 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:38 denisse@deploy2002: Finished deploy [librenms/librenms@9bdfb73]: Upgrade LibreNMS to 26.3.1 (duration: 00m 19s)
* 01:38 denisse@deploy2002: Started deploy [librenms/librenms@9bdfb73]: Upgrade LibreNMS to 26.3.1
* 01:10 denisse@deploy2002: Finished deploy [librenms/librenms@d152b36]: Upgrade LibreNMS to 25.11.0 (duration: 00m 08s)
* 01:10 denisse@deploy2002: Started deploy [librenms/librenms@d152b36]: Upgrade LibreNMS to 25.11.0
== 2026-03-17 ==
* 23:44 btullis@cumin1003: START - Cookbook sre.hadoop.reboot-workers for Hadoop analytics cluster
* 23:38 btullis@cumin1003: END (FAIL) - Cookbook sre.hadoop.reboot-workers (exit_code=99) for Hadoop analytics cluster
* 22:55 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3081.*
* 22:20 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3073.esams.wmnet [reason: trixie reimaging]
* 22:15 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3073.esams.wmnet with OS trixie
* 22:11 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3072.esams.wmnet [reason: trixie reimaging]
* 22:10 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3072.esams.wmnet with OS trixie
* 22:05 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on releases1003.eqiad.wmnet with reason: [[phab:T420246|T420246]]
* 22:05 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T420246|T420246]]
* 21:48 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3073.esams.wmnet with reason: host reimage
* 21:44 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3072.esams.wmnet with reason: host reimage
* 21:41 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3073.esams.wmnet with reason: host reimage
* 21:39 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3072.esams.wmnet with reason: host reimage
* 21:38 ryankemper: [[phab:T411568|T411568]] Failed back HDFS NameNode from an-master1004 to an-master1003; cluster back to original active/standby configuration
* 21:15 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3073.esams.wmnet with OS trixie
* 21:14 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3073.esams.wmnet [reason: trixie reimaging]
* 21:14 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3072.esams.wmnet with OS trixie
* 21:14 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3072.esams.wmnet [reason: trixie reimaging]
* 21:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3070.esams.wmnet [reason: trixie reimaging]
* 21:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3071.esams.wmnet [reason: trixie reimaging]
* 21:09 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3070.esams.wmnet with OS trixie
* 21:05 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3071.esams.wmnet with OS trixie
* 20:59 alexsanford@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254280{{!}}Remove notice from login form in popup mode (T418534)]] (duration: 07m 32s)
* 20:56 alexsanford@deploy2002: alexsanford: Continuing with sync
* 20:54 alexsanford@deploy2002: alexsanford: Backport for [[gerrit:1254280{{!}}Remove notice from login form in popup mode (T418534)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 alexsanford@deploy2002: Started scap sync-world: Backport for [[gerrit:1254280{{!}}Remove notice from login form in popup mode (T418534)]]
* 20:48 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:43 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:43 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3070.esams.wmnet with reason: host reimage
* 20:40 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 20:40 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 20:38 ryankemper: [[phab:T411568|T411568]] failed over HDFS NameNode from an-master1003 to an-master1004, then rebooted `an-master1003`
* 20:38 ryankemper: [[phab:T411568|T411568]] rebooted `an-coord1003`, `an-coord1004`, `an-tool1007`, `an-tool1008`, `an-tool1011`, `an-web1001`
* 20:38 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3071.esams.wmnet with reason: host reimage
* 20:34 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3070.esams.wmnet with reason: host reimage
* 20:34 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3071.esams.wmnet with reason: host reimage
* 20:31 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254301{{!}}Passwordless login: Don't display conditional auth errors]], [[gerrit:1254302{{!}}Passwordless login: Don't display conditional auth errors]] (duration: 08m 56s)
* 20:27 catrope@deploy2002: catrope: Continuing with sync
* 20:24 catrope@deploy2002: catrope: Backport for [[gerrit:1254301{{!}}Passwordless login: Don't display conditional auth errors]], [[gerrit:1254302{{!}}Passwordless login: Don't display conditional auth errors]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:22 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1254301{{!}}Passwordless login: Don't display conditional auth errors]], [[gerrit:1254302{{!}}Passwordless login: Don't display conditional auth errors]]
* 20:16 ryankemper: [[phab:T411568|T411568]] rebooted `an-test-master1002`, `an-test-master1003`, `an-test-master1004`, `archiva1002`
* 20:12 aude@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251309{{!}}Set wgReadingListsBetaDefaultForNewAccountsAfter for beta cluster (T419163)]] (duration: 08m 53s)
* 20:09 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3071.esams.wmnet with OS trixie
* 20:09 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3070.esams.wmnet with OS trixie
* 20:08 aude@deploy2002: aude: Continuing with sync
* 20:08 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3070.esams.wmnet [reason: trixie reimaging]
* 20:08 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3068.esams.wmnet [reason: trixie reimaging]
* 20:07 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3069.esams.wmnet [reason: trixie reimaging]
* 20:06 aude@deploy2002: aude: Backport for [[gerrit:1251309{{!}}Set wgReadingListsBetaDefaultForNewAccountsAfter for beta cluster (T419163)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:03 aude@deploy2002: Started scap sync-world: Backport for [[gerrit:1251309{{!}}Set wgReadingListsBetaDefaultForNewAccountsAfter for beta cluster (T419163)]]
* 19:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3081.esams.wmnet with OS trixie
* 19:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3069.esams.wmnet with OS trixie
* 19:54 ryankemper: [[phab:T411568|T411568]] rebooted `an-test-client1002`, `an-test-ui1001`, `an-test-coord1001`, `an-test-master1001`
* 19:50 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3068.esams.wmnet with OS trixie
* 19:46 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 19:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast2003.wikimedia.org with OS trixie
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3081.esams.wmnet with reason: host reimage
* 19:28 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3081.esams.wmnet with reason: host reimage
* 19:28 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases1003.eqiad.wmnet with reason: [[phab:T420246|T420246]]
* 19:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3069.esams.wmnet with reason: host reimage
* 19:23 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3068.esams.wmnet with reason: host reimage
* 19:21 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3069.esams.wmnet with reason: host reimage
* 19:20 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3068.esams.wmnet with reason: host reimage
* 19:19 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast2003.wikimedia.org with reason: host reimage
* 19:11 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on bast2003.wikimedia.org with reason: host reimage
* 19:08 dzahn@dns1004: END - running authdns-update
* 19:07 dzahn@dns1004: START - running authdns-update
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3081.esams.wmnet with OS trixie
* 19:05 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp3081.esams.wmnet with OS trixie
* 19:00 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3080.*
* 18:56 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3069.esams.wmnet with OS trixie
* 18:55 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3069.esams.wmnet [reason: trixie reimaging]
* 18:55 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3068.esams.wmnet with OS trixie
* 18:55 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid public cluster: Reboot Druid nodes
* 18:54 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3068.esams.wmnet [reason: trixie reimaging]
* 18:53 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3066.esams.wmnet [reason: trixie reimaging]
* 18:53 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3067.esams.wmnet [reason: trixie reimaging]
* 18:50 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host bast2003.wikimedia.org with OS trixie
* 18:49 swfrench-wmf: manually uncordoned wikikube-worker-exp1001.eqiad.wmnet after failed reboot
* 18:47 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3080.esams.wmnet with OS trixie
* 18:43 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3067.esams.wmnet with OS trixie
* 18:40 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3066.esams.wmnet with OS trixie
* 18:32 dwisehaupt@dns1005: END - running authdns-update
* 18:31 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast2003.wikimedia.org with OS bookworm
* 18:31 dwisehaupt@dns1005: START - running authdns-update
* 18:25 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 18:25 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 18:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3080.esams.wmnet with reason: host reimage
* 18:19 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[5-8].magru.wmnet<nowiki>}</nowiki> and A:cp
* 18:19 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7008.magru.wmnet
* 18:17 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:16 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:16 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3067.esams.wmnet with reason: host reimage
* 18:16 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 18:13 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3066.esams.wmnet with reason: host reimage
* 18:09 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3080.esams.wmnet with reason: host reimage
* 18:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast2003.wikimedia.org with reason: host reimage
* 18:04 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3067.esams.wmnet with reason: host reimage
* 18:03 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3066.esams.wmnet with reason: host reimage
* 18:03 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on bast2003.wikimedia.org with reason: host reimage
* 17:52 cgoubert@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1312-1327].eqiad.wmnet,wikikube-worker-exp1001.eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 17:52 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 17:52 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 17:46 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3080.esams.wmnet with OS trixie
* 17:44 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:43 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:43 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:42 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3081.esams.wmnet with OS trixie
* 17:42 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp3081.esams.wmnet with OS trixie
* 17:42 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:41 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid public cluster: Reboot Druid nodes
* 17:40 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:39 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7013.magru.wmnet,cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 17:39 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7016.magru.wmnet
* 17:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:37 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:37 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7007.magru.wmnet
* 17:37 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:34 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:33 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:31 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:29 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3067.esams.wmnet with OS trixie
* 17:29 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3067.esams.wmnet [reason: trixie reimaging]
* 17:28 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3066.esams.wmnet with OS trixie
* 17:28 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:27 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp3066.esams.wmnet with OS trixie
* 17:27 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3066.esams.wmnet with OS trixie
* 17:26 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:26 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3066.esams.wmnet [reason: trixie reimaging]
* 17:21 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:20 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:19 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3081.esams.wmnet with OS trixie
* 17:19 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:16 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1071.eqiad.wmnet
* 17:16 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:16 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2069.codfw.wmnet
* 17:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 17:14 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:14 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:14 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:13 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:13 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:09 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7014.*
* 17:09 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1071.eqiad.wmnet
* 17:09 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2069.codfw.wmnet
* 17:09 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:08 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1070.eqiad.wmnet
* 17:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2068.codfw.wmnet
* 17:07 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:06 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:06 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host bast2003.wikimedia.org with OS bookworm
* 17:06 cgoubert@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1312-1327].eqiad.wmnet,wikikube-worker-exp1001.eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 17:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['bast2003']
* 17:02 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2068.codfw.wmnet
* 17:02 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1070.eqiad.wmnet
* 17:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2067.codfw.wmnet
* 17:01 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1069.eqiad.wmnet
* 17:01 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 17:00 cgoubert@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7014.magru.wmnet with OS trixie
* 16:58 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 16:58 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 16:58 btullis@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
* 16:57 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7015.magru.wmnet
* 16:56 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7006.magru.wmnet
* 16:55 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1069.eqiad.wmnet
* 16:55 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2067.codfw.wmnet
* 16:53 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1068.eqiad.wmnet
* 16:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2066.codfw.wmnet
* 16:47 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:47 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist all cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 16:46 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 16:46 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['bast2003']
* 16:45 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1068.eqiad.wmnet
* 16:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2066.codfw.wmnet
* 16:44 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1067.eqiad.wmnet
* 16:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2065.codfw.wmnet
* 16:42 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid analytics cluster: Reboot Druid nodes
* 16:40 btullis@cumin1003: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
* 16:37 cgoubert@cumin1003: START - Cookbook sre.dns.netbox
* 16:36 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1067.eqiad.wmnet
* 16:36 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2065.codfw.wmnet
* 16:36 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2064.codfw.wmnet
* 16:35 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1066.eqiad.wmnet
* 16:34 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist group2 cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 16:33 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul2003.codfw.wmnet with OS trixie
* 16:33 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7014.magru.wmnet with reason: host reimage
* 16:32 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wikikube-worker[1306,1308-1311].eqiad.wmnet
* 16:32 cgoubert@cumin1003: START - Cookbook sre.hosts.remove-downtime for wikikube-worker[1306,1308-1311].eqiad.wmnet
* 16:28 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7014.magru.wmnet with reason: host reimage
* 16:28 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1066.eqiad.wmnet
* 16:28 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2064.codfw.wmnet
* 16:25 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on releases2003.codfw.wmnet with reason: [[phab:T420246|T420246]]
* 16:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2063.codfw.wmnet
* 16:25 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1065.eqiad.wmnet
* 16:25 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1306,1308-1311].eqiad.wmnet
* 16:25 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1306,1308-1311].eqiad.wmnet
* 16:18 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1065.eqiad.wmnet
* 16:18 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2063.codfw.wmnet
* 16:18 btullis@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-jumbo-eqiad
* 16:17 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2062.codfw.wmnet
* 16:17 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1064.eqiad.wmnet
* 16:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7013.magru.wmnet
* 16:15 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul2003.codfw.wmnet with reason: host reimage
* 16:14 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7005.magru.wmnet
* 16:10 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2062.codfw.wmnet
* 16:08 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul2003.codfw.wmnet with reason: host reimage
* 16:07 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1064.eqiad.wmnet
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7014.magru.wmnet with OS trixie
* 16:05 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp7014.magru.wmnet with OS trixie
* 16:03 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7013.magru.wmnet,cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 16:03 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[5-8].magru.wmnet<nowiki>}</nowiki> and A:cp
* 15:54 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1009.eqiad.wmnet
* 15:54 mutante: zuul2003 - reimaging with trixie
* 15:52 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist group1 cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 15:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2033.codfw.wmnet
* 15:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2033.codfw.wmnet
* 15:46 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul2003.codfw.wmnet with OS trixie
* 15:45 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1009.eqiad.wmnet
* 15:45 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1008.eqiad.wmnet
* 15:44 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist group0 cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 15:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2033.codfw.wmnet
* 15:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 15:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2048.codfw.wmnet
* 15:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2048.codfw.wmnet
* 15:36 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1008.eqiad.wmnet
* 15:36 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1007.eqiad.wmnet
* 15:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2048.codfw.wmnet
* 15:34 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1012.eqiad.wmnet with reason: host reimage
* 15:33 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist testwikis cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 15:32 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid analytics cluster: Reboot Druid nodes
* 15:28 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1007.eqiad.wmnet
* 15:28 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1006.eqiad.wmnet
* 15:27 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1012.eqiad.wmnet with reason: host reimage
* 15:27 samtar@deploy2002: mwscript-k8s job started: cleanupWatchlistLabelMember.php --wiki=testwiki # [[phab:T420328|T420328]]
* 15:27 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2008-dev.codfw.wmnet
* 15:25 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 15:23 jmm@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 15:22 btullis@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-jumbo-eqiad
* 15:21 jmm@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 15:20 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2008-dev.codfw.wmnet
* 15:20 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1006.eqiad.wmnet
* 15:20 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1005.eqiad.wmnet
* 15:18 jmm@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 15:18 urbanecm@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244723{{!}}cleanup: Growth: Remove temporary GrowthMentorList overrides (T418518)]] (duration: 06m 32s)
* 15:16 jmm@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 15:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 16509
* 15:14 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2005-dev.codfw.wmnet
* 15:14 urbanecm@deploy2002: urbanecm: Continuing with sync
* 15:13 urbanecm@deploy2002: urbanecm: Backport for [[gerrit:1244723{{!}}cleanup: Growth: Remove temporary GrowthMentorList overrides (T418518)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:13 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 15:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2048.codfw.wmnet
* 15:11 urbanecm@deploy2002: Started scap sync-world: Backport for [[gerrit:1244723{{!}}cleanup: Growth: Remove temporary GrowthMentorList overrides (T418518)]]
* 15:10 brennen@deploy2002: Finished deploy [phabricator/deployment@e845707]: deploy phab1004 for [[phab:T420366|T420366]] (duration: 01m 02s)
* 15:09 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1005.eqiad.wmnet
* 15:09 brennen@deploy2002: Started deploy [phabricator/deployment@e845707]: deploy phab1004 for [[phab:T420366|T420366]]
* 15:09 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254217{{!}}Create dblists for wikis where CheckUser and AbuseFilter are disabled (T420063 T420062)]] (duration: 06m 38s)
* 15:09 brennen@deploy2002: Finished deploy [phabricator/deployment@e845707]: deploy phab2002 for [[phab:T420366|T420366]] (duration: 00m 35s)
* 15:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudservices2005-dev.codfw.wmnet
* 15:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2004-dev.codfw.wmnet
* 15:08 brennen@deploy2002: Started deploy [phabricator/deployment@e845707]: deploy phab2002 for [[phab:T420366|T420366]]
* 15:08 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2009.codfw.wmnet
* 15:05 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
* 15:05 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 15:05 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
* 15:04 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1254217{{!}}Create dblists for wikis where CheckUser and AbuseFilter are disabled (T420063 T420062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:04 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7014.magru.wmnet with OS trixie
* 15:03 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host phab1004.eqiad.wmnet
* 15:02 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1254217{{!}}Create dblists for wikis where CheckUser and AbuseFilter are disabled (T420063 T420062)]]
* 15:02 topranks: reset BGP session to ssw1-d8-eqiad from lsw1-d4-eqiad [[phab:T420180|T420180]]
* 15:02 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 15:02 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudservices2004-dev.codfw.wmnet
* 15:02 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2003-dev.codfw.wmnet
* 15:00 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2009.codfw.wmnet
* 15:00 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2008.codfw.wmnet
* 14:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2034.codfw.wmnet
* 14:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2034.codfw.wmnet
* 14:57 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host phab1004.eqiad.wmnet
* 14:55 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2003-dev.codfw.wmnet
* 14:55 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2002-dev.codfw.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host durum4004.ulsfo.wmnet
* 14:53 jmm@deploy2002: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 14:53 jmm@deploy2002: helmfile [staging] START helmfile.d/services/thumbor: apply
* 14:52 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2008.codfw.wmnet
* 14:52 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2007.codfw.wmnet
* 14:51 topranks: stop accepting routes on ssw1-d8-eqiad from external peers (cr2-eqiad, other spines) [[phab:T420351|T420351]]
* 14:51 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
* 14:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host durum4004.ulsfo.wmnet
* 14:50 topranks: stop announcing routes from ssw1-d8-eqiad to external peers (cr2-eqiad, other spines) [[phab:T420351|T420351]]
* 14:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2034.codfw.wmnet
* 14:48 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2002-dev.codfw.wmnet
* 14:48 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2001-dev.codfw.wmnet
* 14:46 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2007.codfw.wmnet
* 14:46 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2006.codfw.wmnet
* 14:45 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
* 14:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
* 14:44 taavi: deploying cr firewall changes from https://gerrit.wikimedia.org/r/c/operations/homer/public/+/1254211
* 14:44 topranks: stop announcing "direct" routes to ssw1-d8-eqiad from cr2-eqiad [[phab:T420351|T420351]]
* 14:44 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2034.codfw.wmnet
* 14:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
* 14:43 moritzm: failover Ganeti master in codfw to ganeti2047
* 14:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2050.codfw.wmnet
* 14:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2050.codfw.wmnet
* 14:41 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2001-dev.codfw.wmnet
* 14:41 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2007-dev.codfw.wmnet
* 14:40 topranks: disabling EVPN IBGP peering from ssw1-d8-eqiad to ssw1-d1-eqiad to stop them reflecting routes [[phab:T420351|T420351]]
* 14:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1006.eqiad.wmnet
* 14:39 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
* 14:38 inflatador: bking@requestctl remove `wdqs_highest_error_rate_ever_seen` requestctl rule as it is no longer needed
* 14:38 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2006.codfw.wmnet
* 14:37 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2005.codfw.wmnet
* 14:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2050.codfw.wmnet
* 14:35 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2007-dev.codfw.wmnet
* 14:35 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2006-dev.codfw.wmnet
* 14:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1006.eqiad.wmnet
* 14:34 Daimona: Creating ce_event_goals DB table for the CampaignEvents extension in x1.testwiki, x1.test2wiki, x1.officewiki, and x1.wikishared # [[phab:T411433|T411433]]
* 14:34 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2050.codfw.wmnet
* 14:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2049.codfw.wmnet
* 14:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2049.codfw.wmnet
* 14:31 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
* 14:30 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2005.codfw.wmnet
* 14:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2049.codfw.wmnet
* 14:27 topranks: de-pref internet circuits landing on cr2-eqiad to shift traffic to cr1 [[phab:T420351|T420351]]
* 14:27 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2006-dev.codfw.wmnet
* 14:27 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2005-dev.codfw.wmnet
* 14:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2049.codfw.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2047.codfw.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2047.codfw.wmnet
* 14:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-test-presto1001.eqiad.wmnet
* 14:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host an-test-presto1001.eqiad.wmnet
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2047.codfw.wmnet
* 14:19 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2005-dev.codfw.wmnet
* 14:19 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudlb2004-dev.codfw.wmnet
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 14:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2047.codfw.wmnet
* 14:14 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2046.codfw.wmnet
* 14:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2046.codfw.wmnet
* 14:13 topranks: disable VRRP on cr2-eqiad interfaces facing ssw1-d8-eqiad [[phab:T420351|T420351]]
* 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 14:11 moritzm: powercycling ganeti2046 (stuck on reboot)
* 14:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2004.codfw.wmnet
* 14:10 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudlb2004-dev.codfw.wmnet
* 14:10 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudlb2003-dev.codfw.wmnet
* 14:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people2004.codfw.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet
* 14:05 topranks: setting cr1-eqiad as VRRP master for all vlans [[phab:T420351|T420351]]
* 14:01 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudlb2003-dev.codfw.wmnet
* 14:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet
* 13:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2046.codfw.wmnet
* 13:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6002.drmrs.wmnet
* 13:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6002.drmrs.wmnet
* 13:57 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.20 refs [[phab:T413811|T413811]]
* 13:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2046.codfw.wmnet
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2045.codfw.wmnet
* 13:52 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudlb2002-dev.codfw.wmnet
* 13:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6002.drmrs.wmnet
* 13:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2045.codfw.wmnet
* 13:45 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254189{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]], [[gerrit:1254190{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]] (duration: 08m 10s)
* 13:44 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2003.codfw.wmnet
* 13:42 esanders@deploy2002: esanders: Continuing with sync
* 13:39 esanders@deploy2002: esanders: Backport for [[gerrit:1254189{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]], [[gerrit:1254190{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:38 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be2003.codfw.wmnet
* 13:38 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2004.codfw.wmnet
* 13:37 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1254189{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]], [[gerrit:1254190{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]]
* 13:35 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on logstash2023.codfw.wmnet with reason: ganeti reboot
* 13:33 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 13:32 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host apus-be2004.codfw.wmnet
* 13:32 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2045.codfw.wmnet
* 13:32 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2002.codfw.wmnet
* 13:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6002.drmrs.wmnet
* 13:30 cscott@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251610{{!}}Turn on postprocessing cache for all Parsoid parses (T348255)]] (duration: 10m 31s)
* 13:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 13:26 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be2002.codfw.wmnet
* 13:26 cscott@deploy2002: cscott: Continuing with sync
* 13:26 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2001.codfw.wmnet
* 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6001.drmrs.wmnet
* 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6001.drmrs.wmnet
* 13:22 cscott@deploy2002: cscott: Backport for [[gerrit:1251610{{!}}Turn on postprocessing cache for all Parsoid parses (T348255)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2044.codfw.wmnet
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2044.codfw.wmnet
* 13:20 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be2001.codfw.wmnet
* 13:20 cgoubert@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker13[00-47].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 13:20 cscott@deploy2002: Started scap sync-world: Backport for [[gerrit:1251610{{!}}Turn on postprocessing cache for all Parsoid parses (T348255)]]
* 13:20 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:19 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:19 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2280-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 13:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6001.drmrs.wmnet
* 13:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6001.drmrs.wmnet
* 13:16 btullis@cumin1003: END (PASS) - Cookbook sre.presto.reboot-workers (exit_code=0) for Presto an-presto cluster: Reboot Presto nodes
* 13:15 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1003.eqiad.wmnet
* 13:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2044.codfw.wmnet
* 13:15 aklapper@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254166{{!}}Remove misplaced readonly from CategoryViewer::$query (T420315)]] (duration: 06m 31s)
* 13:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint1001.eqiad.wmnet
* 13:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint1001.eqiad.wmnet
* 13:11 aklapper@deploy2002: zabe, aklapper: Continuing with sync
* 13:11 aklapper@deploy2002: zabe, aklapper: Backport for [[gerrit:1254166{{!}}Remove misplaced readonly from CategoryViewer::$query (T420315)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:10 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:10 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be1003.eqiad.wmnet
* 13:10 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 16509
* 13:09 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1004.eqiad.wmnet
* 13:09 aklapper@deploy2002: Started scap sync-world: Backport for [[gerrit:1254166{{!}}Remove misplaced readonly from CategoryViewer::$query (T420315)]]
* 13:08 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint2001.codfw.wmnet
* 13:04 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host apus-be1004.eqiad.wmnet
* 13:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint2001.codfw.wmnet
* 13:02 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2044.codfw.wmnet
* 13:02 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1002.eqiad.wmnet
* 13:01 moritzm: failover Ganeti masters in drmrs to ganeti6003/6004
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6004.drmrs.wmnet
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6004.drmrs.wmnet
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2043.codfw.wmnet
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2043.codfw.wmnet
* 12:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 214657
* 12:56 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 214657
* 12:56 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 56308
* 12:55 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-eqiad
* 12:55 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 56308
* 12:55 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 28788
* 12:55 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be1002.eqiad.wmnet
* 12:55 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1001.eqiad.wmnet
* 12:54 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 28788
* 12:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6004.drmrs.wmnet
* 12:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2043.codfw.wmnet
* 12:53 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'email' for AS: 28788
* 12:53 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 28788
* 12:53 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 9269
* 12:52 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1012
* 12:52 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:51 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 9269
* 12:51 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1012
* 12:51 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-e8-eqiad
* 12:51 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-e8-eqiad
* 12:50 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:48 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1015
* 12:48 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be1001.eqiad.wmnet
* 12:45 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1015
* 12:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6004.drmrs.wmnet
* 12:44 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:44 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2043.codfw.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6003.drmrs.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6003.drmrs.wmnet
* 12:40 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 12:39 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2042.codfw.wmnet
* 12:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2042.codfw.wmnet
* 12:38 moritzm: powercycling ganeti2042 (stuck on reboot)
* 12:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6003.drmrs.wmnet
* 12:34 moritzm: powercycling ganeti2041 (stuck on reboot)
* 12:31 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:24 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6003.drmrs.wmnet
* 12:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install1005.wikimedia.org
* 12:22 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2042.codfw.wmnet
* 12:20 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-cluster
* 12:20 Emperor: roll-reboot apus frontends (codfw) for March reboots
* 12:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install1005.wikimedia.org
* 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install2005.wikimedia.org
* 12:13 topranks: restart BGP announcements from ssw1-d1-eqiad following change [[phab:T420180|T420180]]
* 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2042.codfw.wmnet
* 12:08 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2280-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 12:07 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
* 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install2005.wikimedia.org
* 12:06 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
* 12:06 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
* 12:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install3004.wikimedia.org
* 12:06 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
* 12:05 jayme@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=(registry1005.eqiad.wmnet{{!}}registry2005.codfw.wmnet)
* 12:05 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2005.codfw.wmnet
* 12:05 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1005.eqiad.wmnet
* 12:04 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
* 12:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install3004.wikimedia.org
* 12:04 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
* 12:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install4003.wikimedia.org
* 12:03 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c7-eqiad [[phab:T420180|T420180]]
* 12:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2041.codfw.wmnet
* 12:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2041.codfw.wmnet
* 12:01 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1005.eqiad.wmnet
* 12:01 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2005.codfw.wmnet
* 12:00 jayme@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=(registry1005.eqiad.wmnet{{!}}registry2005.codfw.wmnet)
* 12:00 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c6-eqiad [[phab:T420180|T420180]]
* 12:00 jayme@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=(registry1004.eqiad.wmnet{{!}}registry2004.codfw.wmnet)
* 11:59 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1004.eqiad.wmnet
* 11:59 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2004.codfw.wmnet
* 11:59 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c4-eqiad [[phab:T420180|T420180]]
* 11:58 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c3-eqiad [[phab:T420180|T420180]]
* 11:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install4003.wikimedia.org
* 11:56 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c2-eqiad [[phab:T420180|T420180]]
* 11:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install5003.wikimedia.org
* 11:55 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1004.eqiad.wmnet
* 11:55 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2004.codfw.wmnet
* 11:54 jayme@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=(registry1004.eqiad.wmnet{{!}}registry2004.codfw.wmnet)
* 11:54 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-d3-eqiad [[phab:T420180|T420180]]
* 11:53 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-d1-eqiad [[phab:T420180|T420180]]
* 11:52 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 11:52 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5007.eqsin.wmnet
* 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5007.eqsin.wmnet
* 11:49 btullis@cumin1003: START - Cookbook sre.hadoop.reboot-workers for Hadoop analytics cluster
* 11:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install5003.wikimedia.org
* 11:48 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-eqiad
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install6003.wikimedia.org
* 11:47 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 11:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install6003.wikimedia.org
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2041.codfw.wmnet
* 11:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5007.eqsin.wmnet
* 11:43 btullis@cumin1003: START - Cookbook sre.presto.reboot-workers for Presto an-presto cluster: Reboot Presto nodes
* 11:41 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5007.eqsin.wmnet
* 11:41 cgoubert@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker13[00-47].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 11:39 topranks: stop accepting external routes on ssw1-d1-eqiad from cr1-eqiad [[phab:T420180|T420180]]
* 11:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install7002.wikimedia.org
* 11:33 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-cluster
* 11:33 Emperor: roll-reboot apus frontends (eqiad) for March reboots
* 11:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install7002.wikimedia.org
* 11:28 moritzm: failover Ganeti master in eqsin to ganeti5004
* 11:27 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2041.codfw.wmnet
* 11:24 topranks: reduce local-preference for BGP routes learnt from servers on cr1-eqiad [[phab:T420180|T420180]]
* 11:22 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 11:18 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2040.codfw.wmnet
* 11:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2040.codfw.wmnet
* 11:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5006.eqsin.wmnet
* 11:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2040.codfw.wmnet
* 11:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5006.eqsin.wmnet
* 11:05 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:05 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:04 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:03 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2040.codfw.wmnet
* 11:01 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5005.eqsin.wmnet
* 11:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5005.eqsin.wmnet
* 11:00 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:59 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:58 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:58 topranks: prepend external BGP announcements from cr1-eqiad [[phab:T420180|T420180]]
* 10:57 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:56 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:56 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2039.codfw.wmnet
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2039.codfw.wmnet
* 10:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5005.eqsin.wmnet
* 10:52 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-eqiad
* 10:51 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:49 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:49 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:49 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5005.eqsin.wmnet
* 10:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2039.codfw.wmnet
* 10:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 10:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 10:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2039.codfw.wmnet
* 10:45 javiermonton@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 10:45 javiermonton@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2038.codfw.wmnet
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2038.codfw.wmnet
* 10:43 javiermonton@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 10:43 javiermonton@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
* 10:42 topranks: cease announcing routed networks from ssw1-d1-eqiad to cr1-eqiad in BGP [[phab:T420180|T420180]]
* 10:41 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:41 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:40 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:39 javiermonton@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 10:39 javiermonton@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
* 10:38 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2038.codfw.wmnet
* 10:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 10:37 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:33 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudgw2004-dev.codfw.wmnet
* 10:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2038.codfw.wmnet
* 10:29 topranks: stop announcing directly connected routes to L3 switches from cr1-eqiad [[phab:T420180|T420180]]
* 10:28 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:27 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudgw2004-dev.codfw.wmnet
* 10:27 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudgw2003-dev.codfw.wmnet
* 10:26 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:25 topranks: disable EVPN IBGP peering between ssw1-d1-eqiad and ssw1-d8-eqiad [[phab:T420180|T420180]]
* 10:24 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:21 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 10:21 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 10:20 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudgw2003-dev.codfw.wmnet
* 10:20 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:19 urbanecm: Delete `job/growthexperiments-listtaskcounts-29513771` from mw-cron (job stuck for more than a month)
* 10:19 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid1003.eqiad.wmnet
* 10:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 10:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2037.codfw.wmnet
* 10:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2037.codfw.wmnet
* 10:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid1003.eqiad.wmnet
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2002.codfw.wmnet
* 10:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2037.codfw.wmnet
* 10:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2002.codfw.wmnet
* 10:06 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2002.codfw.wmnet
* 10:05 topranks: disabling VRRP for et-1/0/5 sub-interfaces on cr1-eqiad [[phab:T420180|T420180]]
* 10:03 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:03 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:02 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:01 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:01 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:01 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:00 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf2002.codfw.wmnet
* 10:00 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2001.codfw.wmnet
* 09:59 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:59 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:59 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:59 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:58 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid2003.codfw.wmnet
* 09:57 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:57 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 09:57 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 09:56 topranks: shift traffic from codfw to eqiad off Arelion CCT to Lumen
* 09:56 mvernon@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling reboot on A:thanos-fe
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid2003.codfw.wmnet
* 09:54 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf2001.codfw.wmnet
* 09:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 09:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2037.codfw.wmnet
* 09:53 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:52 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:50 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:47 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 09:47 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
* 09:42 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:42 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
* 09:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2002.codfw.wmnet
* 09:38 moritzm: installing openssl bugfix updates on trixie hosts
* 09:31 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2002.codfw.wmnet
* 09:31 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2001.codfw.wmnet
* 09:25 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2001.codfw.wmnet
* 09:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1002.eqiad.wmnet
* 09:21 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-eqiad
* 09:20 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1002.eqiad.wmnet
* 09:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
* 09:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
* 09:10 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254114{{!}}Revert "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]] (duration: 12m 36s)
* 09:06 topranks: increase VRRP priority on eqiad vlans on CR2 to shift active gateway to cr2-eqiad [[phab:T420180|T420180]]
* 09:05 mvernon@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling reboot on A:thanos-fe
* 09:03 kharlan@deploy2002: kharlan: Continuing with sync
* 09:02 kharlan@deploy2002: kharlan: Backport for [[gerrit:1254114{{!}}Revert "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:58 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-canary
* 08:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2036.codfw.wmnet
* 08:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2036.codfw.wmnet
* 08:57 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1254114{{!}}Revert "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]]
* 08:57 moritzm: rebuilt the trixie d-i image for the 13.4 point release [[phab:T420240|T420240]]
* 08:54 kharlan@deploy2002: Sync cancelled.
* 08:52 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-canary
* 08:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2036.codfw.wmnet
* 08:49 kharlan@deploy2002: harroyo-wmf, kharlan: Backport for [[gerrit:1250575{{!}}hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend (T419125)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2036.codfw.wmnet
* 08:44 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host bast2003.wikimedia.org
* 08:43 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1250575{{!}}hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend (T419125)]]
* 08:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:42 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:35 arnaudb@cumin1003: END (PASS) - Cookbook sre.gerrit.restart-gerrit (exit_code=0) Restarting Gerrit on gerrit2002
* 08:34 arnaudb@cumin1003: START - Cookbook sre.gerrit.restart-gerrit Restarting Gerrit on gerrit2002
* 08:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
* 08:34 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host contint1002.wikimedia.org
* 08:32 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:28 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
* 08:27 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host contint1002.wikimedia.org
* 08:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2035.codfw.wmnet
* 08:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2035.codfw.wmnet
* 08:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host parsoidtest1001.eqiad.wmnet
* 08:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2035.codfw.wmnet
* 08:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti3005.esams.wmnet to cluster esams03 and group B
* 08:14 moritzm: powercycling bast2003 (stuck on reboot)
* 08:14 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti3005.esams.wmnet to cluster esams03 and group B
* 08:14 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host parsoidtest1001.eqiad.wmnet
* 08:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3005.esams.wmnet
* 08:09 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:08 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3005.esams.wmnet
* 07:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
* 07:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5004.wikimedia.org
* 07:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti3005.esams.wmnet with OS bookworm
* 07:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5004.wikimedia.org
* 07:37 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1285-1289,1291-1299].eqiad.wmnet
* 07:37 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1285-1289,1291-1299].eqiad.wmnet
* 07:34 arnaudb@cumin1003: END (PASS) - Cookbook sre.gerrit.restart-gerrit (exit_code=0) Restarting Gerrit on gerrit2003
* 07:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2035.codfw.wmnet
* 07:32 arnaudb@cumin1003: START - Cookbook sre.gerrit.restart-gerrit Restarting Gerrit on gerrit2003
* 07:32 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2033.codfw.wmnet
* 07:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 07:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2032.codfw.wmnet
* 07:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2032.codfw.wmnet
* 07:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti3005.esams.wmnet with reason: host reimage
* 07:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti3005.esams.wmnet with reason: host reimage
* 07:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2032.codfw.wmnet
* 07:23 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2032.codfw.wmnet
* 07:00 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti3005.esams.wmnet with OS bookworm
* 06:08 kart_: Updated cxserver to 2026-03-16-071247-production ([[phab:T420004|T420004]])
* 06:07 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 06:06 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 06:05 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 06:04 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 05:58 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 05:58 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 04:41 dwisehaupt@dns1005: END - running authdns-update
* 04:39 dwisehaupt@dns1005: START - running authdns-update
* 04:01 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.17 (duration: 01m 17s)
* 03:43 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.20 refs [[phab:T413811|T413811]] (duration: 39m 34s)
* 03:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.20 refs [[phab:T413811|T413811]]
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 10s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 00:26 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6009.*
* 00:25 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6009.drmrs.wmnet with OS trixie
* 00:07 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251158{{!}}Enable languages in main menu on Russian Wikipedia (T419730)]] (duration: 06m 57s)
* 00:03 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 00:02 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1251158{{!}}Enable languages in main menu on Russian Wikipedia (T419730)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1251158{{!}}Enable languages in main menu on Russian Wikipedia (T419730)]]
== 2026-03-16 ==
* 23:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6009.drmrs.wmnet with reason: host reimage
* 23:56 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253604{{!}}Don't output language HTML when no languages present (T419730)]], [[gerrit:1251157{{!}}Support duplication of languages in header and main menu (T419730)]] (duration: 06m 44s)
* 23:56 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6009.drmrs.wmnet with reason: host reimage
* 23:52 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 23:51 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1253604{{!}}Don't output language HTML when no languages present (T419730)]], [[gerrit:1251157{{!}}Support duplication of languages in header and main menu (T419730)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:50 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1253604{{!}}Don't output language HTML when no languages present (T419730)]], [[gerrit:1251157{{!}}Support duplication of languages in header and main menu (T419730)]]
* 23:36 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6009.drmrs.wmnet with OS trixie
* 23:32 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp601(0{{!}}1).*
* 22:54 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6008.drmrs.wmnet [reason: trixie reimaging]
* 22:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6008.drmrs.wmnet with OS trixie
* 22:37 jforrester@deploy2002: Finished scap sync-world: [[phab:T411807|T411807]] (duration: 11m 10s)
* 22:35 jforrester@deploy2002: jforrester: Continuing with sync
* 22:32 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6010.drmrs.wmnet with OS trixie
* 22:31 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp70[09-12].magru.wmnet<nowiki>}</nowiki> and A:cp
* 22:31 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7012.magru.wmnet
* 22:31 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[1285-1289,1291-1299].eqiad.wmnet
* 22:30 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[1-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 22:30 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7004.magru.wmnet
* 22:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6011.drmrs.wmnet with OS trixie
* 22:28 jforrester@deploy2002: jforrester: [[phab:T411807|T411807]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 jforrester@deploy2002: Started scap sync-world: [[phab:T411807|T411807]]
* 22:27 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1285-1289,1291-1299].eqiad.wmnet
* 22:24 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6008.drmrs.wmnet with reason: host reimage
* 22:20 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6008.drmrs.wmnet with reason: host reimage
* 22:17 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1020-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 22:07 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6010.drmrs.wmnet with reason: host reimage
* 22:05 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6007.drmrs.wmnet [reason: trixie reimaging]
* 22:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6011.drmrs.wmnet with reason: host reimage
* 22:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6007.drmrs.wmnet with OS trixie
* 22:02 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6008.drmrs.wmnet with OS trixie
* 21:59 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6010.drmrs.wmnet with reason: host reimage
* 21:58 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6011.drmrs.wmnet with reason: host reimage
* 21:58 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp6008.drmrs.wmnet with OS trixie
* 21:52 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7011.magru.wmnet
* 21:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7003.magru.wmnet
* 21:42 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul1003.eqiad.wmnet with OS trixie
* 21:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6008.drmrs.wmnet with reason: host reimage
* 21:40 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6010.drmrs.wmnet with OS trixie
* 21:40 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6012.*
* 21:39 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6012.drmrs.wmnet with OS trixie
* 21:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6011.drmrs.wmnet with OS trixie
* 21:38 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6007.drmrs.wmnet with reason: host reimage
* 21:36 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6013.*
* 21:36 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6008.drmrs.wmnet with reason: host reimage
* 21:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6013.drmrs.wmnet with OS trixie
* 21:32 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6007.drmrs.wmnet with reason: host reimage
* 21:28 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul1003.eqiad.wmnet with reason: host reimage
* 21:22 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul1003.eqiad.wmnet with reason: host reimage
* 21:19 Dreamy_Jazz: Evening UTC backport window done
* 21:18 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251848{{!}}Disable CheckUser on closed wikis where no checks were ever made (T420062)]], [[gerrit:1251865{{!}}Uninstall SecurePoll from closed wikis (T420062)]], [[gerrit:1251888{{!}}DiscussionTools: Uninstall wikis closed before permalinks were deployed (T420052)]] (duration: 06m 10s)
* 21:17 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6008.drmrs.wmnet with OS trixie
* 21:16 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6008.drmrs.wmnet [reason: trixie reimaging]
* 21:15 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6006.drmrs.wmnet [reason: trixie reimaging]
* 21:15 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6006.drmrs.wmnet with OS trixie
* 21:14 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 21:14 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1251848{{!}}Disable CheckUser on closed wikis where no checks were ever made (T420062)]], [[gerrit:1251865{{!}}Uninstall SecurePoll from closed wikis (T420062)]], [[gerrit:1251888{{!}}DiscussionTools: Uninstall wikis closed before permalinks were deployed (T420052)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:13 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6012.drmrs.wmnet with reason: host reimage
* 21:12 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6007.drmrs.wmnet with OS trixie
* 21:12 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1251848{{!}}Disable CheckUser on closed wikis where no checks were ever made (T420062)]], [[gerrit:1251865{{!}}Uninstall SecurePoll from closed wikis (T420062)]], [[gerrit:1251888{{!}}DiscussionTools: Uninstall wikis closed before permalinks were deployed (T420052)]]
* 21:12 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6007.drmrs.wmnet [reason: trixie reimaging]
* 21:11 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6005.drmrs.wmnet [reason: trixie reimaging]
* 21:10 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6005.drmrs.wmnet with OS trixie
* 21:10 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7010.magru.wmnet
* 21:10 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7002.magru.wmnet
* 21:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6013.drmrs.wmnet with reason: host reimage
* 21:08 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul1003.eqiad.wmnet with OS trixie
* 21:07 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6012.drmrs.wmnet with reason: host reimage
* 21:06 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6013.drmrs.wmnet with reason: host reimage
* 21:05 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253623{{!}}Fix client credentials access tokens (T417278 T419921)]], [[gerrit:1253625{{!}}Enable $wgTrackMediaRequestProvenance on testwikis and beta cluster (T414338)]], [[gerrit:1253626{{!}}Configure $wgApiClientErrorSampleRate (T418957)]] (duration: 08m 06s)
* 21:01 catrope@deploy2002: matmarex, catrope: Continuing with sync
* 20:59 catrope@deploy2002: matmarex, catrope: Backport for [[gerrit:1253623{{!}}Fix client credentials access tokens (T417278 T419921)]], [[gerrit:1253625{{!}}Enable $wgTrackMediaRequestProvenance on testwikis and beta cluster (T414338)]], [[gerrit:1253626{{!}}Configure $wgApiClientErrorSampleRate (T418957)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:57 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1253623{{!}}Fix client credentials access tokens (T417278 T419921)]], [[gerrit:1253625{{!}}Enable $wgTrackMediaRequestProvenance on testwikis and beta cluster (T414338)]], [[gerrit:1253626{{!}}Configure $wgApiClientErrorSampleRate (T418957)]]
* 20:54 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:54 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:50 brett@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cp[2027-2040].codfw.wmnet
* 20:50 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:50 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2027-2040].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:50 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2027-2040].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6006.drmrs.wmnet with reason: host reimage
* 20:48 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6012.drmrs.wmnet with OS trixie
* 20:46 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6013.drmrs.wmnet with OS trixie
* 20:45 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on cp2042.codfw.wmnet with reason: Testing hosts - not for production
* 20:45 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6005.drmrs.wmnet with reason: host reimage
* 20:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2042.codfw.wmnet with OS trixie
* 20:44 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephmon2007-dev.codfw.wmnet with OS bookworm
* 20:44 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 20:44 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253566{{!}}Configure external link aggregate usage on 12 wikis for top domains (T419837)]] (duration: 06m 59s)
* 20:43 brett@cumin2002: START - Cookbook sre.dns.netbox
* 20:41 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6006.drmrs.wmnet with reason: host reimage
* 20:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on cp2041.codfw.wmnet with reason: Testing hosts - not for production
* 20:41 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6005.drmrs.wmnet with reason: host reimage
* 20:40 kharlan@deploy2002: kharlan, mszwarc: Continuing with sync
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2041.codfw.wmnet with OS trixie
* 20:39 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 20:38 kharlan@deploy2002: kharlan, mszwarc: Backport for [[gerrit:1253566{{!}}Configure external link aggregate usage on 12 wikis for top domains (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:37 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1253566{{!}}Configure external link aggregate usage on 12 wikis for top domains (T419837)]]
* 20:34 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6014.*
* 20:33 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:33 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:32 cscott@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253551{{!}}Fix double post-processing in legacy preview case (T419908)]] (duration: 06m 52s)
* 20:29 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7009.magru.wmnet
* 20:28 cscott@deploy2002: cscott: Continuing with sync
* 20:28 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7001.magru.wmnet
* 20:27 cscott@deploy2002: cscott: Backport for [[gerrit:1253551{{!}}Fix double post-processing in legacy preview case (T419908)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:26 cscott@deploy2002: Started scap sync-world: Backport for [[gerrit:1253551{{!}}Fix double post-processing in legacy preview case (T419908)]]
* 20:22 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2042.codfw.wmnet with reason: host reimage
* 20:22 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6006.drmrs.wmnet with OS trixie
* 20:21 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6006.drmrs.wmnet [reason: trixie reimaging]
* 20:21 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6004.drmrs.wmnet [reason: trixie reimaging]
* 20:21 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6005.drmrs.wmnet with OS trixie
* 20:20 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6004.drmrs.wmnet with OS trixie
* 20:20 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6005.drmrs.wmnet [reason: trixie reimaging]
* 20:19 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6003.drmrs.wmnet [reason: trixie reimaging]
* 20:19 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephmon2007-dev.codfw.wmnet with reason: host reimage
* 20:19 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp70[09-12].magru.wmnet<nowiki>}</nowiki> and A:cp
* 20:18 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[1-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 20:17 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248665{{!}}Enable passwordless login in production (T419198)]], [[gerrit:1253572{{!}}Instrument clicks on external links to selected domains (T419837)]] (duration: 06m 43s)
* 20:16 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:15 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
* 20:15 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephmon2007-dev.codfw.wmnet with reason: host reimage
* 20:15 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6003.drmrs.wmnet with OS trixie
* 20:13 catrope@deploy2002: kharlan, catrope: Continuing with sync
* 20:12 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:12 catrope@deploy2002: kharlan, catrope: Backport for [[gerrit:1248665{{!}}Enable passwordless login in production (T419198)]], [[gerrit:1253572{{!}}Instrument clicks on external links to selected domains (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:12 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2042.codfw.wmnet with reason: host reimage
* 20:11 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
* 20:10 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1248665{{!}}Enable passwordless login in production (T419198)]], [[gerrit:1253572{{!}}Instrument clicks on external links to selected domains (T419837)]]
* 20:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6014.drmrs.wmnet with OS trixie
* 20:03 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp[2027-2040].codfw.wmnet
* 20:01 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251589{{!}}Uninstall GlobalBlocking from closed wikis (T420062)]] (duration: 08m 20s)
* 19:57 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 19:55 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6004.drmrs.wmnet with reason: host reimage
* 19:55 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephmon2007-dev.codfw.wmnet with OS bookworm
* 19:54 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1251589{{!}}Uninstall GlobalBlocking from closed wikis (T420062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:54 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp2042.codfw.wmnet with OS trixie
* 19:53 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephmon2007-dev.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:53 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS trixie
* 19:52 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1251589{{!}}Uninstall GlobalBlocking from closed wikis (T420062)]]
* 19:52 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host cloudcephmon2007-dev.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:51 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251582{{!}}Uninstall AbuseFilter from closed wikis with no AbuseFilter logs (T420063)]] (duration: 09m 26s)
* 19:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6003.drmrs.wmnet with reason: host reimage
* 19:47 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 19:47 mutante: releases2003 - rm rsync-srv-org-wikimedia-releases-releases2003.* - alerts flapping since server reboot - puppet code needs to be improved to ensure units are removed when primary server is switched ([[phab:T420246|T420246]])
* 19:47 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6004.drmrs.wmnet with reason: host reimage
* 19:46 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6003.drmrs.wmnet with reason: host reimage
* 19:44 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1251582{{!}}Uninstall AbuseFilter from closed wikis with no AbuseFilter logs (T420063)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:43 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6014.drmrs.wmnet with reason: host reimage
* 19:42 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1251582{{!}}Uninstall AbuseFilter from closed wikis with no AbuseFilter logs (T420063)]]
* 19:41 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudcephmon2007-dev
* 19:41 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudcephmon2007-dev
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating cloudcephmon2007-dev in codfw - jhancock@cumin2002"
* 19:40 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6014.drmrs.wmnet with reason: host reimage
* 19:39 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253622{{!}}Revert "Media: Use previous step for non-standard width between steps and original" (T419927)]] (duration: 07m 10s)
* 19:35 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 19:34 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1253622{{!}}Revert "Media: Use previous step for non-standard width between steps and original" (T419927)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:32 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating cloudcephmon2007-dev in codfw - jhancock@cumin2002"
* 19:32 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1253622{{!}}Revert "Media: Use previous step for non-standard width between steps and original" (T419927)]]
* 19:28 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp404[5-6].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 19:28 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4046.ulsfo.wmnet
* 19:27 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6004.drmrs.wmnet with OS trixie
* 19:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 19:27 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6004.drmrs.wmnet [reason: trixie reimaging]
* 19:27 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6003.drmrs.wmnet with OS trixie
* 19:26 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6003.drmrs.wmnet [reason: trixie reimaging]
* 19:25 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6002.drmrs.wmnet [reason: trixie reimaging]
* 19:25 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6001.drmrs.wmnet [reason: trixie reimaging]
* 19:21 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6014.drmrs.wmnet with OS trixie
* 19:17 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6002.drmrs.wmnet with OS trixie
* 19:17 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on cp2042.codfw.wmnet with reason: Testing hosts - not for production
* 19:16 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on cp2041.codfw.wmnet with reason: Testing hosts - not for production
* 19:15 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:15 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:12 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6001.drmrs.wmnet with OS trixie
* 19:02 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:02 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 18:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp4046.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 18:57 cdobbins@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4046.ulsfo.wmnet
* 18:52 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6002.drmrs.wmnet with reason: host reimage
* 18:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4045.ulsfo.wmnet
* 18:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6001.drmrs.wmnet with reason: host reimage
* 18:47 cdobbins@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp4046.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 18:47 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6002.drmrs.wmnet with reason: host reimage
* 18:45 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6001.drmrs.wmnet with reason: host reimage
* 18:39 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp404[5-6].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 18:38 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6015.*
* 18:38 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp403[7-9].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 18:38 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4039.ulsfo.wmnet
* 18:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6015.drmrs.wmnet with OS trixie
* 18:27 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6002.drmrs.wmnet with OS trixie
* 18:26 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6002.drmrs.wmnet [reason: trixie reimaging]
* 18:26 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6001.drmrs.wmnet with OS trixie
* 18:24 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6001.drmrs.wmnet [reason: trixie reimaging]
* 18:03 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6015.drmrs.wmnet with reason: host reimage
* 17:59 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6015.drmrs.wmnet with reason: host reimage
* 17:58 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4038.ulsfo.wmnet
* 17:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6015.drmrs.wmnet with OS trixie
* 17:37 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6016.*
* 17:32 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2042.codfw.wmnet with OS trixie
* 17:18 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4037.ulsfo.wmnet
* 17:08 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2042.codfw.wmnet with reason: host reimage
* 17:06 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp403[7-9].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 17:03 fabfur@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2042.codfw.wmnet with reason: host reimage
* 17:02 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6016.drmrs.wmnet with OS trixie
* 16:57 mutante: contint2002 - rebooting
* 16:47 mutante: phab2002 - rebooting
* 16:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:44 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253520{{!}}Add APCOND_OATH_HAS2FA to UserRequirementsPrivateConditions]] (duration: 06m 15s)
* 16:42 mutante: rebooting backends of releases.wikimedia.org
* 16:42 fabfur@cumin1003: START - Cookbook sre.hosts.reimage for host cp2042.codfw.wmnet with OS trixie
* 16:41 fabfur: reimage cp2042 for HAProxy testing ([[phab:T419825|T419825]])
* 16:41 mszwarc@deploy2002: mszwarc: Continuing with sync
* 16:40 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1253520{{!}}Add APCOND_OATH_HAS2FA to UserRequirementsPrivateConditions]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:39 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2041.codfw.wmnet with OS trixie
* 16:38 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1253520{{!}}Add APCOND_OATH_HAS2FA to UserRequirementsPrivateConditions]]
* 16:37 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1020-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 16:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6016.drmrs.wmnet with reason: host reimage
* 16:32 milimetric: my bad, accidentally merged https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1250249, will read docs on config deployment better
* 16:31 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:29 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1012
* 16:27 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1012
* 16:27 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6016.drmrs.wmnet with reason: host reimage
* 16:20 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]] (duration: 07m 28s)
* 16:17 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode2001.codfw.wmnet
* 16:16 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 16:14 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:13 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2005.codfw.wmnet
* 16:12 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]]
* 16:12 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode2001.codfw.wmnet
* 16:11 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode1001.eqiad.wmnet
* 16:11 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts,name=codfw
* 16:11 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
* 16:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum2001.codfw.wmnet
* 16:09 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1024.eqiad.wmnet
* 16:09 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1024.eqiad.wmnet
* 16:09 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1024.eqiad.wmnet
* 16:07 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1004-1007,1011-1012,1015-1016,1019-1021,1029-1031,1034-1168,1240-1289,1291-1327].eqiad.wmnet
* 16:06 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1004-1007,1011-1012,1015-1016,1019-1021,1029-1031,1034-1168,1240-1289,1291-1327].eqiad.wmnet
* 16:06 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6016.drmrs.wmnet with OS trixie
* 16:06 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2005.codfw.wmnet
* 16:06 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode1001.eqiad.wmnet
* 16:05 dwisehaupt@dns1006: END - running authdns-update
* 16:05 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum2001.codfw.wmnet
* 16:05 fabfur@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
* 16:04 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts,name=codfw
* 16:04 dwisehaupt@dns1006: START - running authdns-update
* 16:04 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts,name=eqiad
* 16:00 blake@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1004-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 15:59 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum1001.eqiad.wmnet
* 15:59 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2031.codfw.wmnet
* 15:59 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti2031.codfw.wmnet
* 15:54 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2004.codfw.wmnet
* 15:53 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum1001.eqiad.wmnet
* 15:52 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts,name=eqiad
* 15:52 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncmonitor1001.eqiad.wmnet
* 15:47 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2004.codfw.wmnet
* 15:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2031.codfw.wmnet
* 15:47 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet
* 15:46 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1024.eqiad.wmnet with reason: Rebooting clouddb1024 [[phab:T419960|T419960]]
* 15:44 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1024.eqiad.wmnet
* 15:43 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2031.codfw.wmnet
* 15:43 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1023.eqiad.wmnet
* 15:43 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1023.eqiad.wmnet
* 15:43 fabfur@cumin1003: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS trixie
* 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2030.codfw.wmnet
* 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2030.codfw.wmnet
* 15:42 fabfur: reimage cp2041 for HAProxy testing ([[phab:T419825|T419825]])
* 15:42 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:41 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:41 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2003.codfw.wmnet
* 15:39 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:38 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:38 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-codfw
* 15:38 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:38 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:37 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Rebooting clouddb1023 [[phab:T419960|T419960]]
* 15:35 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1022.eqiad.wmnet
* 15:35 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1022.eqiad.wmnet
* 15:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2030.codfw.wmnet
* 15:32 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2003.codfw.wmnet
* 15:32 dwisehaupt@dns1006: END - running authdns-update
* 15:32 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2002.codfw.wmnet
* 15:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 15:31 dwisehaupt@dns1006: START - running authdns-update
* 15:27 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling reboot on A:swift-fe-codfw
* 15:26 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-codfw
* 15:26 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2029.codfw.wmnet
* 15:26 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti2029.codfw.wmnet
* 15:26 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:26 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:24 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:24 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:22 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2002.codfw.wmnet
* 15:21 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Rebooting clouddb1023 [[phab:T419960|T419960]]
* 15:20 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2001.codfw.wmnet
* 15:20 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]]
* 15:16 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Rebooting clouddb1022 [[phab:T419960|T419960]]
* 15:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2029.codfw.wmnet
* 15:11 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling reboot on A:durum and A:durum
* 15:09 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2029.codfw.wmnet
* 15:04 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2001.codfw.wmnet
* 15:03 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:02 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling reboot on A:wikidough
* 15:01 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudgw1004.eqiad.wmnet
* 15:01 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 15:00 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2028.codfw.wmnet
* 15:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2028.codfw.wmnet
* 14:56 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:55 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:55 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:54 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]]
* 14:53 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudgw1004.eqiad.wmnet
* 14:53 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-codfw
* 14:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2028.codfw.wmnet
* 14:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2028.codfw.wmnet
* 14:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:50 mvernon@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling reboot on A:swift-fe-eqiad
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw1003.eqiad.wmnet with OS trixie
* 14:31 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:31 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:30 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:28 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:28 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:26 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:22 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1002-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1002-1003].eqiad.wmnet
* 14:22 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1002-1003].eqiad.wmnet
* 14:21 blake@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1002-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:21 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1002-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:20 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:20 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:20 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:18 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253461{{!}}fix(anon warning): remove wring type=signup param (T415160)]], [[gerrit:1253450{{!}}AccountCreation: track account registrations for WE1.8 experiments (T416100)]] (duration: 09m 16s)
* 14:17 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:17 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:14 sgimeno@deploy2002: sgimeno: Continuing with sync
* 14:13 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:13 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:11 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:11 sgimeno@deploy2002: sgimeno: Backport for [[gerrit:1253461{{!}}fix(anon warning): remove wring type=signup param (T415160)]], [[gerrit:1253450{{!}}AccountCreation: track account registrations for WE1.8 experiments (T416100)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:11 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:10 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw1003.eqiad.wmnet with reason: host reimage
* 14:09 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1253461{{!}}fix(anon warning): remove wring type=signup param (T415160)]], [[gerrit:1253450{{!}}AccountCreation: track account registrations for WE1.8 experiments (T416100)]]
* 14:08 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2026.codfw.wmnet
* 14:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 14:08 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:04 arnaudb@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gerrit2002.wikimedia.org with reason: testing
* 14:03 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw1003.eqiad.wmnet with reason: host reimage
* 14:02 arnaudb@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 2:00:00 on gerrit2002.wikimedia.org with reason: [[phab:T418256|T418256]]
* 14:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1003.eqiad.wmnet
* 13:58 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 13:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1003.eqiad.wmnet
* 13:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1002.eqiad.wmnet
* 13:45 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253046{{!}}bowiki: update logos (T419268)]] (duration: 06m 17s)
* 13:45 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudgw1003.eqiad.wmnet with OS trixie
* 13:43 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-codfw
* 13:43 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad
* 13:41 mszwarc@deploy2002: mszwarc, anzx: Continuing with sync
* 13:41 mszwarc@deploy2002: mszwarc, anzx: Backport for [[gerrit:1253046{{!}}bowiki: update logos (T419268)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1002.eqiad.wmnet
* 13:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2004.codfw.wmnet
* 13:39 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1253046{{!}}bowiki: update logos (T419268)]]
* 13:38 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253423{{!}}Always use external actor for interwiki rights logs on target wiki (T6055)]] (duration: 08m 53s)
* 13:34 mszwarc@deploy2002: mszwarc: Continuing with sync
* 13:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2004.codfw.wmnet
* 13:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2026.codfw.wmnet
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2025.codfw.wmnet
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2003.codfw.wmnet
* 13:31 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1253423{{!}}Always use external actor for interwiki rights logs on target wiki (T6055)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 13:29 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1253423{{!}}Always use external actor for interwiki rights logs on target wiki (T6055)]]
* 13:28 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad
* 13:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2003.codfw.wmnet
* 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow3004.esams.wmnet
* 13:25 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling reboot on A:wikidough
* 13:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 13:22 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-durum rolling reboot on A:durum and A:durum
* 13:21 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2025.codfw.wmnet
* 13:21 XioNoX: drain edgeuno transit for optic replacement - [[phab:T415743|T415743]]
* 13:19 cgoubert@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host wikikube-ctrl1004.eqiad.wmnet
* 13:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow3004.esams.wmnet
* 13:14 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251487{{!}}Replace direct BagOStuff with WANObjectCache (T419666)]] (duration: 11m 25s)
* 13:11 bwojtowicz@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow4003.ulsfo.wmnet
* 13:09 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti3005.esams.wmnet
* 13:09 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ganeti3005.esams.wmnet
* 13:07 jforrester@deploy2002: jforrester: Continuing with sync
* 13:06 jforrester@deploy2002: jforrester: Backport for [[gerrit:1251487{{!}}Replace direct BagOStuff with WANObjectCache (T419666)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl1004.eqiad.wmnet
* 13:04 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 13:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ncredir4002.ulsfo.wmnet
* 13:04 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ncredir4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 13:03 jiji@cumin1003: END (ERROR) - Cookbook sre.memcached.roll-reboot-restart (exit_code=97) rolling reboot on A:memcached-gutter-eqiad
* 13:03 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ncredir4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 13:03 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1251487{{!}}Replace direct BagOStuff with WANObjectCache (T419666)]]
* 13:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow4003.ulsfo.wmnet
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow5002.eqsin.wmnet
* 12:51 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl1003.eqiad.wmnet
* 12:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow5002.eqsin.wmnet
* 12:48 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3005.esams.wmnet
* 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow6001.drmrs.wmnet
* 12:44 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:43 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3005.esams.wmnet
* 12:42 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl1003.eqiad.wmnet
* 12:41 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl1002.eqiad.wmnet
* 12:40 aikochou@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 12:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow6001.drmrs.wmnet
* 12:37 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ncredir4002.ulsfo.wmnet
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ncredir4001.ulsfo.wmnet
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ncredir4001.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 12:35 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:34 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow7002.magru.wmnet
* 12:32 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:32 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:32 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:28 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1017
* 12:27 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ncredir4001.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 12:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow7002.magru.wmnet
* 12:27 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1017
* 12:25 aikochou@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 12:25 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl1002.eqiad.wmnet
* 12:22 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:20 moritzm: failover Ganeti master in esams to ganeti3008
* 12:20 moritzm: failover Ganeti master in esams to ganeti3005
* 12:15 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:14 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:10 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ncredir4001.ulsfo.wmnet
* 12:00 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti3006.esams.wmnet
* 12:00 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti3006.esams.wmnet
* 11:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for druid[1009-1013].eqiad.wmnet
* 11:57 btullis@cumin1003: START - Cookbook sre.hosts.remove-downtime for druid[1009-1013].eqiad.wmnet
* 11:57 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.remove-downtime (exit_code=97) for druid[1009-1013].eqiad.wmnet
* 11:57 btullis@cumin1003: START - Cookbook sre.hosts.remove-downtime for druid[1009-1013].eqiad.wmnet
* 11:46 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1009.eqiad.wmnet with OS bookworm
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3006.esams.wmnet
* 11:43 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1010.eqiad.wmnet with OS bookworm
* 11:37 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1011.eqiad.wmnet with OS bookworm
* 11:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1012.eqiad.wmnet with OS bookworm
* 11:30 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3006.esams.wmnet
* 11:29 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1013.eqiad.wmnet with OS bookworm
* 11:24 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:24 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:22 btullis@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6 days, 0:00:00 on dse-k8s-worker[1012,1015-1017].eqiad.wmnet with reason: Adding 10 Gbps NIC
* 11:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1009.eqiad.wmnet with reason: host reimage
* 11:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1010.eqiad.wmnet with reason: host reimage
* 11:15 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:14 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:12 mvernon@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling reboot on A:swift-fe-eqiad
* 11:12 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling reboot on A:swift-fe-codfw
* 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1011.eqiad.wmnet with reason: host reimage
* 11:10 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:10 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1012.eqiad.wmnet with reason: host reimage
* 11:07 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gerrit2003.wikimedia.org
* 11:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1013.eqiad.wmnet with reason: host reimage
* 11:06 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 11:06 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 11:04 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1010.eqiad.wmnet with reason: host reimage
* 11:02 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1011.eqiad.wmnet with reason: host reimage
* 11:02 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1009.eqiad.wmnet with reason: host reimage
* 11:01 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1012.eqiad.wmnet with reason: host reimage
* 11:00 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gerrit2003.wikimedia.org
* 10:57 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1013.eqiad.wmnet with reason: host reimage
* 10:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2010.codfw.wmnet
* 10:47 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1013.eqiad.wmnet with OS bookworm
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1012.eqiad.wmnet with OS bookworm
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1011.eqiad.wmnet with OS bookworm
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1010.eqiad.wmnet with OS bookworm
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1009.eqiad.wmnet with OS bookworm
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3007.esams.wmnet
* 10:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2010.codfw.wmnet
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3007.esams.wmnet
* 10:31 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3007.esams.wmnet
* 10:29 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3007.esams.wmnet
* 10:28 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:28 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3008.esams.wmnet
* 10:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3008.esams.wmnet
* 10:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2009.codfw.wmnet
* 10:24 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:24 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:23 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:20 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2009.codfw.wmnet
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3008.esams.wmnet
* 10:14 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3008.esams.wmnet
* 10:09 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:08 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2004.codfw.wmnet
* 10:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7001.magru.wmnet
* 10:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7001.magru.wmnet
* 10:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2004.codfw.wmnet
* 09:56 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7001.magru.wmnet
* 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts tcp-proxy4002.ulsfo.wmnet
* 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: tcp-proxy4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 09:54 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: tcp-proxy4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 09:51 elukey@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 09:51 elukey@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 09:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7001.magru.wmnet
* 09:51 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:46 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts tcp-proxy4002.ulsfo.wmnet
* 09:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: decom tcp-proxy4001 - jmm@cumin2002"
* 09:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: decom tcp-proxy4001 - jmm@cumin2002"
* 09:43 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm2001.wikimedia.org
* 09:39 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idm2001.wikimedia.org
* 09:38 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:38 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 09:38 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 09:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:35 slyngshede@dns1004: END - running authdns-update
* 09:34 elukey@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 09:34 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 09:34 elukey@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 09:33 slyngshede@dns1004: START - running authdns-update
* 09:32 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:30 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm1001.wikimedia.org
* 09:26 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 09:26 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idm1001.wikimedia.org
* 09:24 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm-test1001.wikimedia.org
* 09:22 moritzm: failover Ganeti master in magru to ganeti7004
* 09:21 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts tcp-proxy4001.ulsfo.wmnet
* 09:21 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 09:20 jmm@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-test-eqiad
* 09:20 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idm-test1001.wikimedia.org
* 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2002.codfw.wmnet
* 09:18 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:15 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM cloudidp2001-dev.codfw.wmnet
* 09:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2002.codfw.wmnet
* 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7002.magru.wmnet
* 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7002.magru.wmnet
* 09:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2001.codfw.wmnet
* 09:13 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts tcp-proxy4001.ulsfo.wmnet
* 09:11 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM cloudidp2001-dev.codfw.wmnet
* 09:09 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idp2005.wikimedia.org
* 09:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2001.codfw.wmnet
* 09:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7002.magru.wmnet
* 09:05 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idp2005.wikimedia.org
* 09:03 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7002.magru.wmnet
* 09:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pki-root1002.eqiad.wmnet
* 08:59 slyngshede@dns1004: END - running authdns-update
* 08:58 slyngshede@dns1004: START - running authdns-update
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7003.magru.wmnet
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7003.magru.wmnet
* 08:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host pki-root1002.eqiad.wmnet
* 08:49 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idp1005.wikimedia.org
* 08:48 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:48 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 08:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7003.magru.wmnet
* 08:48 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 08:47 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7003.magru.wmnet
* 08:44 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idp1005.wikimedia.org
* 08:44 jmm@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-test-eqiad
* 08:44 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 08:39 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idp-test1005.wikimedia.org
* 08:35 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idp-test1005.wikimedia.org
* 08:33 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idp-test2005.wikimedia.org
* 08:29 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idp-test2005.wikimedia.org
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7004.magru.wmnet
* 08:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7004.magru.wmnet
* 08:22 taavi@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet,service=s3
* 08:18 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251276{{!}}Fix broken survey links on PersonalDashboard (T419950)]] (duration: 32m 09s)
* 08:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7004.magru.wmnet
* 08:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7004.magru.wmnet
* 08:06 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 08:05 kgraessle@deploy2002: kgraessle: Continuing with sync
* 08:04 kgraessle@deploy2002: kgraessle: Backport for [[gerrit:1251276{{!}}Fix broken survey links on PersonalDashboard (T419950)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:59 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 07:52 moritzm: installing Linux 5.10.251 on Bullseye hosts
* 07:45 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1251276{{!}}Fix broken survey links on PersonalDashboard (T419950)]]
* 07:37 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stewards1001.eqiad.wmnet
* 07:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host stewards1001.eqiad.wmnet
* 07:33 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1003.eqiad.wmnet
* 07:26 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1003.eqiad.wmnet
* 07:25 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aphlict1002.eqiad.wmnet
* 07:21 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host aphlict1002.eqiad.wmnet
* 07:10 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host doc2003.codfw.wmnet
* 07:06 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host doc2003.codfw.wmnet
* 07:02 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists1004.wikimedia.org
* 06:55 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists1004.wikimedia.org
* 05:25 aikochou@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 52s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-15 ==
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 52s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-14 ==
* 14:16 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251941{{!}}CommonSettings: Set class in $wgCentralAuthRC]] (duration: 06m 17s)
* 14:12 reedy@deploy2002: reedy: Continuing with sync
* 14:11 reedy@deploy2002: reedy: Backport for [[gerrit:1251941{{!}}CommonSettings: Set class in $wgCentralAuthRC]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:10 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1251941{{!}}CommonSettings: Set class in $wgCentralAuthRC]]
* 12:51 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251912{{!}}CommonSettings: Specify class in IRC RCFeed setup]] (duration: 06m 19s)
* 12:47 reedy@deploy2002: reedy, lcawte: Continuing with sync
* 12:46 reedy@deploy2002: reedy, lcawte: Backport for [[gerrit:1251912{{!}}CommonSettings: Specify class in IRC RCFeed setup]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:44 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1251912{{!}}CommonSettings: Specify class in IRC RCFeed setup]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 00s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-13 ==
* 22:52 taavi: taavi@deploy2002 ~ $ mwscript CentralAuth:attachAccount.php --wiki=metawiki --userlist backfiller.txt # unify unified Special:CentralAuth/MediaWikiAccountBackfiller on meta
* 20:07 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2004-dev.codfw.wmnet
* 20:01 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudservices2004-dev.codfw.wmnet
* 20:01 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2004-dev.codfw.wmnet
* 19:55 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4052.*
* 19:54 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudservices2004-dev.codfw.wmnet
* 19:54 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4052.ulsfo.wmnet with OS trixie
* 19:53 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2005-dev.codfw.wmnet
* 19:46 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudservices2005-dev.codfw.wmnet
* 19:46 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2004-dev.codfw.wmnet
* 19:40 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4050.*
* 19:40 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudservices2004-dev.codfw.wmnet
* 19:29 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 19:24 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 19:23 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4050.ulsfo.wmnet
* 19:19 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1035.eqiad.wmnet with OS trixie
* 19:19 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1034.eqiad.wmnet with OS trixie
* 19:18 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:18 jclark@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:18 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:16 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4051.*
* 19:15 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp4050.ulsfo.wmnet
* 19:14 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4050.ulsfo.wmnet
* 19:13 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:11 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4051.ulsfo.wmnet with OS trixie
* 19:07 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp4050.ulsfo.wmnet
* 19:02 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1035.eqiad.wmnet with reason: host reimage
* 19:00 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS bookworm
* 19:00 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:00 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 18:58 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1034.eqiad.wmnet with reason: host reimage
* 18:58 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS trixie
* 18:57 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4052.ulsfo.wmnet with OS trixie
* 18:55 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1035.eqiad.wmnet with reason: host reimage
* 18:55 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1034.eqiad.wmnet with reason: host reimage
* 18:47 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4051.ulsfo.wmnet with reason: host reimage
* 18:43 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 18:41 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4051.ulsfo.wmnet with reason: host reimage
* 18:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1035.eqiad.wmnet with OS trixie
* 18:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1034.eqiad.wmnet with OS trixie
* 18:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1033.eqiad.wmnet with OS trixie
* 18:36 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 18:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp4050.ulsfo.wmnet with reason: firmware updates
* 18:34 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS trixie
* 18:24 brett@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp4050.ulsfo.wmnet
* 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4050.ulsfo.wmnet with OS trixie
* 18:22 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS bookworm
* 18:21 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1374.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 18:21 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4051.ulsfo.wmnet with OS trixie
* 18:21 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4051.ulsfo.wmnet with OS trixie
* 18:12 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS bookworm
* 18:10 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1374.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 18:10 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:10 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 18:10 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 18:10 elukey@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1253.eqiad.wmnet with reason: Host went down and paged, depooled
* 18:06 cgoubert@cumin1003: dbctl commit (dc=all): 'Depool db1253', diff saved to https://phabricator.wikimedia.org/P89856 and previous config saved to /var/cache/conftool/dbconfig/20260313-180640-cgoubert.json
* 18:06 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 18:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4051.ulsfo.wmnet with OS trixie
* 18:03 elukey: powercycle db1253 - host not reachable via ssh, no events logged in racadm getsel, no console com2 available (blank screen)
* 17:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 17:56 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 17:49 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4049.*
* 17:46 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4049.ulsfo.wmnet with OS trixie
* 17:37 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:37 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:36 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie
* 17:35 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4050.ulsfo.wmnet with OS trixie
* 17:35 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:34 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:27 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 17:26 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 17:26 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 17:26 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 17:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4049.ulsfo.wmnet with reason: host reimage
* 17:17 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie
* 17:17 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:16 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:16 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4049.ulsfo.wmnet with reason: host reimage
* 17:12 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1016.eqiad.wmnet
* 17:12 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1016.eqiad.wmnet
* 17:11 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1016.eqiad.wmnet
* 17:11 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4048.*
* 17:10 dhinus: (relogging failed sal) conftool action : set/pooled=no; selector: name=clouddb1016.eqiad.wmnet
* 17:10 dhinus: (relogging failed sal) DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1016.eqiad.wmnet with reason: Rebooting clouddb1016 [[phab:T419960|T419960]]
* 17:09 dhinus: (relogging failed sal) END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1015.eqiad.wmnet
* 17:08 dhinus: (relogging failed sal) START - Cookbook sre.hosts.remove-downtime for clouddb1015.eqiad.wmnet
* 17:08 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 17:07 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 17:07 dhinus: fnegri@cumin1003 conftool action : set/pooled=yes; selector: name=clouddb1015.eqiad.wmnet
* 17:07 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 17:07 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4048.ulsfo.wmnet with OS trixie
* 17:06 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 16:40 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4049.ulsfo.wmnet with OS trixie
* 16:39 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4048.ulsfo.wmnet with reason: host reimage
* 16:36 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1015.eqiad.wmnet
* 16:35 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1015.eqiad.wmnet with reason: Rebooting clouddb1015 [[phab:T419960|T419960]]
* 16:34 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1014.eqiad.wmnet
* 16:34 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1014.eqiad.wmnet
* 16:34 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet
* 16:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet
* 16:28 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudweb1003.wikimedia.org
* 16:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet
* 16:22 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudweb1003.wikimedia.org
* 16:21 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudweb1004.wikimedia.org
* 16:20 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1014.eqiad.wmnet with reason: Rebooting clouddb1014 [[phab:T419960|T419960]]
* 16:20 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1014.eqiad.wmnet
* 16:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor-dev2001.codfw.wmnet
* 16:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4048.ulsfo.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4048.ulsfo.wmnet with OS trixie
* 16:16 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudweb1004.wikimedia.org
* 16:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor-dev2001.codfw.wmnet
* 16:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor2003.codfw.wmnet
* 16:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor2003.codfw.wmnet
* 16:00 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:00 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:00 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4048.ulsfo.wmnet with OS trixie
* 15:43 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2010-dev.codfw.wmnet
* 15:38 vgutierrez@cumin1003: END (PASS) - Cookbook sre.loadbalancer.check-ipip (exit_code=0)
* 15:38 vgutierrez@cumin1003: START - Cookbook sre.loadbalancer.check-ipip
* 15:37 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 15:37 vgutierrez@cumin1003: END (FAIL) - Cookbook sre.loadbalancer.check-ipip (exit_code=99)
* 15:37 vgutierrez@cumin1003: START - Cookbook sre.loadbalancer.check-ipip
* 15:36 vgutierrez@cumin1003: END (FAIL) - Cookbook sre.loadbalancer.check-ipip (exit_code=99)
* 15:36 vgutierrez@cumin1003: START - Cookbook sre.loadbalancer.check-ipip
* 15:36 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2010-dev.codfw.wmnet
* 15:35 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2005-dev.codfw.wmnet
* 15:35 vgutierrez@cumin1003: END (FAIL) - Cookbook sre.loadbalancer.check-ipip (exit_code=99)
* 15:35 vgutierrez@cumin1003: START - Cookbook sre.loadbalancer.check-ipip
* 15:28 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2005-dev.codfw.wmnet
* 15:26 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:25 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:23 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:22 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 15:22 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 15:22 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 15:19 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:19 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:19 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:16 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2006-dev.codfw.wmnet
* 15:12 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudidp2001-dev.codfw.wmnet
* 15:08 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudidp2001-dev.codfw.wmnet
* 15:07 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2006-dev.codfw.wmnet
* 14:58 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1013.eqiad.wmnet
* 14:58 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1013.eqiad.wmnet
* 14:57 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet,service=s3
* 14:57 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet,service=s1
* 14:48 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1015.eqiad.wmnet
* 14:46 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1035.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:46 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2005-dev.codfw.wmnet
* 14:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti-test2003.codfw.wmnet
* 14:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2003.codfw.wmnet
* 14:44 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1373.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:43 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1015.eqiad.wmnet
* 14:43 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1034.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:42 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1033.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:40 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1023
* 14:40 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1023
* 14:40 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1022
* 14:40 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1022
* 14:40 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1021
* 14:39 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup2004.codfw.wmnet
* 14:39 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1021
* 14:38 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2005-dev.codfw.wmnet
* 14:37 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1020
* 14:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2003.codfw.wmnet
* 14:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti-test2003.codfw.wmnet
* 14:35 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1020
* 14:35 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Rebooting clouddb1013 [[phab:T419960|T419960]]
* 14:33 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:33 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1035.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:33 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet
* 14:32 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1035.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:32 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1035.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti-test2003.codfw.wmnet
* 14:32 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1034.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti-test2003.codfw.wmnet
* 14:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1033.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1373.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:29 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:29 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt - jclark@cumin1003"
* 14:29 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt - jclark@cumin1003"
* 14:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti-test2002.codfw.wmnet
* 14:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet
* 14:27 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudbackup2004.codfw.wmnet
* 14:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cuminunpriv1001.eqiad.wmnet
* 14:25 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup2003.codfw.wmnet
* 14:25 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 14:25 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 14:24 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 14:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cuminunpriv1001.eqiad.wmnet
* 14:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet
* 14:22 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1004.eqiad.wmnet
* 14:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti-test2002.codfw.wmnet
* 14:14 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudbackup2003.codfw.wmnet
* 14:13 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudbackup1004.eqiad.wmnet
* 14:09 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1003.eqiad.wmnet
* 14:01 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudbackup1003.eqiad.wmnet
* 13:59 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gerrit1003.wikimedia.org
* 13:53 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gerrit1003.wikimedia.org
* 13:49 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists2001.wikimedia.org
* 13:48 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1004.eqiad.wmnet
* 13:46 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host etherpad1004.eqiad.wmnet
* 13:45 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:45 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:44 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1004.eqiad.wmnet
* 13:42 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists2001.wikimedia.org
* 13:42 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host etherpad1004.eqiad.wmnet
* 13:37 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host etherpad2002.codfw.wmnet
* 13:36 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gerrit2002.wikimedia.org
* 13:33 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host etherpad2002.codfw.wmnet
* 13:32 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2003.wikimedia.org
* 13:30 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gerrit2002.wikimedia.org
* 13:26 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab2003.wikimedia.org
* 13:26 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
* 13:24 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2020.codfw.wmnet
* 13:23 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2019.codfw.wmnet
* 13:19 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
* 13:19 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1003.wikimedia.org
* 13:13 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2020.codfw.wmnet
* 13:13 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1003.wikimedia.org
* 13:12 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2019.codfw.wmnet
* 13:11 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.reboot-runner (exit_code=0) rolling reboot on A:gitlab-runner
* 13:05 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2018.codfw.wmnet
* 13:05 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1020.eqiad.wmnet
* 12:54 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2018.codfw.wmnet
* 12:54 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1020.eqiad.wmnet
* 12:54 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2017.codfw.wmnet
* 12:54 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1019.eqiad.wmnet
* 12:53 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:50 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:50 moritzm: powercycle pki1002
* 12:48 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:47 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:44 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:44 mutante: rebooted phab1005 - waiting for it to come back
* 12:44 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:43 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2017.codfw.wmnet
* 12:43 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1019.eqiad.wmnet
* 12:42 bwojtowicz@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 12:40 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1018.eqiad.wmnet
* 12:39 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2016.codfw.wmnet
* 12:31 jelto@cumin1003: START - Cookbook sre.gitlab.reboot-runner rolling reboot on A:gitlab-runner
* 12:29 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1018.eqiad.wmnet
* 12:29 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1017.eqiad.wmnet
* 12:28 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2016.codfw.wmnet
* 12:27 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2015.codfw.wmnet
* 12:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast1004.wikimedia.org
* 12:18 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host doc1004.eqiad.wmnet
* 12:18 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1017.eqiad.wmnet
* 12:17 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:17 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:15 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2015.codfw.wmnet
* 12:15 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:15 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:14 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host doc1004.eqiad.wmnet
* 12:13 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aphlict2001.codfw.wmnet
* 12:10 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host aphlict2001.codfw.wmnet
* 12:10 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: reboot
* 12:10 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts2002.codfw.wmnet
* 12:07 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:07 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:03 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts2002.codfw.wmnet
* 12:02 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:02 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:01 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1016.eqiad.wmnet
* 12:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1019.eqiad.wmnet
* 11:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1018.eqiad.wmnet
* 11:59 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:59 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1019.eqiad.wmnet
* 11:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1018.eqiad.wmnet
* 11:51 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:51 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:50 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1016.eqiad.wmnet
* 11:49 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-backup2004.codfw.wmnet
* 11:43 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-backup2004.codfw.wmnet
* 11:43 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-backup1004.eqiad.wmnet
* 11:37 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-backup1004.eqiad.wmnet
* 11:36 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-backup2003.codfw.wmnet
* 11:34 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-backup1003.eqiad.wmnet
* 11:32 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 11:32 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:30 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-backup2003.codfw.wmnet
* 11:28 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-backup1003.eqiad.wmnet
* 11:27 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:26 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy1001.eqiad.wmnet
* 11:21 arnaudb@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host contint1003.wikimedia.org
* 11:21 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy1001.eqiad.wmnet
* 11:21 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy1002.eqiad.wmnet
* 11:16 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy1002.eqiad.wmnet
* 11:16 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy2001.codfw.wmnet
* 11:16 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host contint1003.wikimedia.org
* 11:12 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-staging-master-codfw
* 11:12 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul1001.eqiad.wmnet
* 11:11 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy2001.codfw.wmnet
* 11:11 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy2002.codfw.wmnet
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1018.eqiad.wmnet with reason: host reimage
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:09 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-staging-master-eqiad
* 11:08 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:08 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul1001.eqiad.wmnet
* 11:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1008-dev.eqiad.wmnet
* 11:07 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy2002.codfw.wmnet
* 11:06 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy3001.esams.wmnet
* 11:05 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1018.eqiad.wmnet with reason: host reimage
* 11:01 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy3001.esams.wmnet
* 11:01 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1008-dev.eqiad.wmnet
* 11:01 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1002-dev.eqiad.wmnet
* 11:01 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy3002.esams.wmnet
* 10:59 mvernon@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 22:00:00 on db1258.eqiad.wmnet with reason: depooled, likely to flap over the weekend
* 10:57 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudbackup1002-dev.eqiad.wmnet
* 10:57 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1001-dev.eqiad.wmnet
* 10:56 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy3002.esams.wmnet
* 10:56 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-staging-master-codfw
* 10:55 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4001.ulsfo.wmnet
* 10:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-staging-worker-codfw
* 10:54 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudbackup1001-dev.eqiad.wmnet
* 10:52 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-staging-master-eqiad
* 10:50 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-staging-worker-eqiad
* 10:50 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 10:50 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4001.ulsfo.wmnet
* 10:50 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4002.ulsfo.wmnet
* 10:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1019.eqiad.wmnet with reason: host reimage
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1019.eqiad.wmnet with reason: host reimage
* 10:45 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4002.ulsfo.wmnet
* 10:45 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy5001.eqsin.wmnet
* 10:40 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy5001.eqsin.wmnet
* 10:39 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy5002.eqsin.wmnet
* 10:37 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul2001.codfw.wmnet
* 10:37 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool', diff saved to https://phabricator.wikimedia.org/P89852 and previous config saved to /var/cache/conftool/dbconfig/20260313-103719-ladsgroup.json
* 10:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul2001.codfw.wmnet
* 10:32 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy5002.eqsin.wmnet
* 10:31 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudweb2002-dev.wikimedia.org
* 10:31 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul1002.eqiad.wmnet
* 10:31 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy6001.drmrs.wmnet
* 10:29 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:29 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:28 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:28 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:28 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 10:27 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 10:27 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:27 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul1002.eqiad.wmnet
* 10:27 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:26 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy6001.drmrs.wmnet
* 10:24 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudweb2002-dev.wikimedia.org
* 10:23 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy6002.drmrs.wmnet
* 10:22 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul2002.codfw.wmnet
* 10:19 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1008.eqiad.wmnet
* 10:18 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul2002.codfw.wmnet
* 10:18 arnaudb@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host zuul2002.codfw.wmnet
* 10:18 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul2002.codfw.wmnet
* 10:18 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy6002.drmrs.wmnet
* 10:16 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy7002.magru.wmnet
* 10:16 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-staging-worker-eqiad
* 10:15 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-staging-worker-codfw
* 10:13 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1008.eqiad.wmnet
* 10:13 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1007.eqiad.wmnet
* 10:12 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy7002.magru.wmnet
* 10:09 jelto@cumin1003: conftool action : set/pooled=yes; selector: name=tcp-proxy7001.magru.wmnet
* 10:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1007.eqiad.wmnet
* 10:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1006.eqiad.wmnet
* 10:07 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy7001.magru.wmnet
* 10:03 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy7001.magru.wmnet
* 10:02 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1006.eqiad.wmnet
* 10:02 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1005.eqiad.wmnet
* 10:01 jelto@cumin1003: conftool action : set/pooled=no; selector: name=tcp-proxy7001.magru.wmnet
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 09:57 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1005.eqiad.wmnet
* 09:57 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1004.eqiad.wmnet
* 09:51 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1004.eqiad.wmnet
* 09:51 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1003.eqiad.wmnet
* 09:50 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:50 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:46 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1003.eqiad.wmnet
* 09:46 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1002.eqiad.wmnet
* 09:41 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1002.eqiad.wmnet
* 09:40 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1001.eqiad.wmnet
* 09:39 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:39 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:35 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1001.eqiad.wmnet
* 09:35 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-ctrl1002.eqiad.wmnet
* 09:34 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-ctrl1001.eqiad.wmnet
* 09:34 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:33 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:32 moritzm: installing Linux 6.1.164 on Bookworm hosts
* 09:30 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-ctrl1002.eqiad.wmnet
* 09:28 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-ctrl1001.eqiad.wmnet
* 09:01 elukey@cumin1003: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 08:37 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 07:56 moritzm: installing Linux 6.12.74 on Trixie hosts
* 07:55 moritzm: installing 6.12.74 on Trixie hosts
* 02:57 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4044.ulsfo.wmnet [reason: trixie reimaging]
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 18s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:41 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4044.ulsfo.wmnet with OS trixie
* 01:37 mutante: contint1003/contint2003 - every time(?) we set up machines with puppet using our httpd module and PHP, the first puppet run hits the same old issue: "Exec[ensure_present_mod_php" failing with "Considering conflict mpm_worker for mpm_prefork". The fix is: 'sudo a2dismod mpm_event' and run puppet again. [[phab:T418521|T418521]]
* 01:26 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on contint1003.wikimedia.org with reason: [[phab:T418521|T418521]]
* 01:26 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on contint2003.wikimedia.org with reason: [[phab:T418521|T418521]]
* 01:23 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on contint2003.wikimedia.org with reason: setup
* 01:22 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on contint1003.wikimedia.org with reason: setup
* 01:22 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4047.*
* 01:09 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage
* 01:08 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4043.ulsfo.wmnet [reason: trixie reimaging]
* 01:06 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage
* 01:05 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4043.ulsfo.wmnet with OS trixie
* 00:55 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4047.ulsfo.wmnet with OS trixie
* 00:45 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4044.ulsfo.wmnet with OS trixie
* 00:45 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4044.ulsfo.wmnet [reason: trixie reimaging]
* 00:42 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4042.ulsfo.wmnet [reason: trixie reimaging]
* 00:41 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4042.ulsfo.wmnet with OS trixie
* 00:39 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4043.ulsfo.wmnet with reason: host reimage
* 00:31 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4043.ulsfo.wmnet with reason: host reimage
* 00:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4047.ulsfo.wmnet with reason: host reimage
* 00:27 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1251187 [[phab:T419637|T419637]] (duration: 07m 12s)
* 00:23 rzl@deploy2002: rzl: Continuing with sync
* 00:23 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4047.ulsfo.wmnet with reason: host reimage
* 00:22 rzl@deploy2002: rzl: https://gerrit.wikimedia.org/r/1251187 [[phab:T419637|T419637]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:21 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1251187 [[phab:T419637|T419637]]
* 00:15 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage
* 00:14 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4040.ulsfo.wmnet [reason: trixie reimaging]
* 00:11 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage
* 00:11 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4043.ulsfo.wmnet with OS trixie
* 00:10 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4040.ulsfo.wmnet with OS trixie
* 00:04 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4047.ulsfo.wmnet with OS trixie
* 00:03 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4047.ulsfo.wmnet with OS trixie
* 00:03 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4043.ulsfo.wmnet with OS trixie
== 2026-03-12 ==
* 23:57 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host o11ytest1001.eqiad.wmnet with OS trixie
* 23:53 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 23:53 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 23:50 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4042.ulsfo.wmnet with OS trixie
* 23:49 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 23:49 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 23:45 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 23:45 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 23:45 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage
* 23:44 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4042.ulsfo.wmnet with OS trixie
* 23:41 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage
* 23:41 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 23:41 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 23:40 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on o11ytest1001.eqiad.wmnet with reason: host reimage
* 23:36 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 23:36 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on o11ytest1001.eqiad.wmnet with reason: host reimage
* 23:36 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 23:35 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 23:35 rzl@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 23:22 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host o11ytest1001
* 23:22 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host o11ytest1001
* 23:21 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4040.ulsfo.wmnet with OS trixie
* 23:19 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4040.ulsfo.wmnet with OS trixie
* 23:18 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host o11ytest1001
* 23:18 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) o11ytest1001.eqiad.wmnet 141.32.64.10.in-addr.arpa 1.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 23:18 herron@cumin1003: START - Cookbook sre.dns.wipe-cache o11ytest1001.eqiad.wmnet 141.32.64.10.in-addr.arpa 1.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 23:18 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:18 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host o11ytest1001 - herron@cumin1003"
* 23:18 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host o11ytest1001 - herron@cumin1003"
* 23:04 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4047.ulsfo.wmnet with OS trixie
* 23:00 herron@cumin1003: START - Cookbook sre.dns.netbox
* 23:00 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host o11ytest1001
* 22:59 herron@cumin1003: START - Cookbook sre.hosts.reimage for host o11ytest1001.eqiad.wmnet with OS trixie
* 22:58 herron@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mwlog1002 to o11ytest1001
* 22:57 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host o11ytest1001
* 22:55 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host o11ytest1001
* 22:55 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) o11ytest1001 on all recursors
* 22:55 herron@cumin1003: START - Cookbook sre.dns.wipe-cache o11ytest1001 on all recursors
* 22:55 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:55 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mwlog1002 to o11ytest1001 - herron@cumin1003"
* 22:54 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mwlog1002 to o11ytest1001 - herron@cumin1003"
* 22:51 herron@cumin1003: START - Cookbook sre.dns.netbox
* 22:50 herron@cumin1003: START - Cookbook sre.hosts.rename from mwlog1002 to o11ytest1001
* 22:42 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4043.ulsfo.wmnet with OS trixie
* 22:42 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4043.ulsfo.wmnet [reason: trixie reimaging]
* 22:41 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4041.ulsfo.wmnet [reason: trixie reimaging]
* 22:39 bvibber@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251190{{!}}Enable ReaderExperiments Share Highlight subfeature for metrics (T416945)]], [[gerrit:1251195{{!}}Metrics module for share highlight experiment baseline (T416945)]] (duration: 06m 49s)
* 22:38 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4041.ulsfo.wmnet with OS trixie
* 22:35 bvibber@deploy2002: bvibber: Continuing with sync
* 22:34 bvibber@deploy2002: bvibber: Backport for [[gerrit:1251190{{!}}Enable ReaderExperiments Share Highlight subfeature for metrics (T416945)]], [[gerrit:1251195{{!}}Metrics module for share highlight experiment baseline (T416945)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:32 bvibber@deploy2002: Started scap sync-world: Backport for [[gerrit:1251190{{!}}Enable ReaderExperiments Share Highlight subfeature for metrics (T416945)]], [[gerrit:1251195{{!}}Metrics module for share highlight experiment baseline (T416945)]]
* 22:28 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1251182 [[phab:T419637|T419637]] (duration: 11m 18s)
* 22:27 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host o11ytest2001.codfw.wmnet with OS trixie
* 22:26 rzl@deploy2002: rzl: Continuing with sync
* 22:24 rzl@deploy2002: rzl: https://gerrit.wikimedia.org/r/1251182 [[phab:T419637|T419637]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:23 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4042.ulsfo.wmnet with OS trixie
* 22:23 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4042.ulsfo.wmnet [reason: trixie reimaging]
* 22:20 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4046.*
* 22:17 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1251182 [[phab:T419637|T419637]]
* 22:13 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4041.ulsfo.wmnet with reason: host reimage
* 22:09 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on o11ytest2001.codfw.wmnet with reason: host reimage
* 22:08 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4041.ulsfo.wmnet with reason: host reimage
* 22:03 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on o11ytest2001.codfw.wmnet with reason: host reimage
* 22:01 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:58 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4040.ulsfo.wmnet with OS trixie
* 21:58 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4040.ulsfo.wmnet [reason: trixie reimaging]
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host o11ytest2001
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host o11ytest2001
* 21:45 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host o11ytest2001
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) o11ytest2001.codfw.wmnet 9.32.192.10.in-addr.arpa 9.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:45 herron@cumin1003: START - Cookbook sre.dns.wipe-cache o11ytest2001.codfw.wmnet 9.32.192.10.in-addr.arpa 9.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host o11ytest2001 - herron@cumin1003"
* 21:45 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host o11ytest2001 - herron@cumin1003"
* 21:43 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4041.ulsfo.wmnet with OS trixie
* 21:41 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4038.ulsfo.wmnet [reason: trixie reimaging]
* 21:40 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4038.ulsfo.wmnet with OS trixie
* 21:39 herron@cumin1003: START - Cookbook sre.dns.netbox
* 21:39 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host o11ytest2001
* 21:39 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:39 herron@cumin1003: START - Cookbook sre.hosts.reimage for host o11ytest2001.codfw.wmnet with OS trixie
* 21:36 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 21:35 herron@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mwlog2002 to o11ytest2001
* 21:35 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 21:35 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 21:35 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host o11ytest2001
* 21:34 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 21:34 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 21:33 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 21:32 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host o11ytest2001
* 21:32 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) o11ytest2001 on all recursors
* 21:32 herron@cumin1003: START - Cookbook sre.dns.wipe-cache o11ytest2001 on all recursors
* 21:32 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:32 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mwlog2002 to o11ytest2001 - herron@cumin1003"
* 21:31 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mwlog2002 to o11ytest2001 - herron@cumin1003"
* 21:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4046.ulsfo.wmnet with OS trixie
* 21:27 herron@cumin1003: START - Cookbook sre.dns.netbox
* 21:26 herron@cumin1003: START - Cookbook sre.hosts.rename from mwlog2002 to o11ytest2001
* 21:20 rzl: rzl@apt1002:~$ sudo -i reprepro copy trixie-wikimedia bullseye-wikimedia envoyproxy
* 21:20 rzl: rzl@apt1002:~$ sudo -i reprepro copy bookworm-wikimedia bullseye-wikimedia envoyproxy
* 21:20 rzl: rzl@apt1002:~$ sudo -i reprepro -C main includedeb bullseye-wikimedia /srv/wikimedia/pool/component/envoy-future/e/envoyproxy/envoyproxy_1.35.9-1_amd64.deb
* 21:13 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage
* 21:13 cscott@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251173{{!}}Revert "Move post-processing of flaggedrevs views inside FlaggablePageView"]] (duration: 07m 28s)
* 21:09 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage
* 21:09 cscott@deploy2002: cscott: Continuing with sync
* 21:07 cscott@deploy2002: cscott: Backport for [[gerrit:1251173{{!}}Revert "Move post-processing of flaggedrevs views inside FlaggablePageView"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:05 cscott@deploy2002: Started scap sync-world: Backport for [[gerrit:1251173{{!}}Revert "Move post-processing of flaggedrevs views inside FlaggablePageView"]]
* 21:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage
* 21:02 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251152{{!}}Use 'alwaysShowLogin' query parameter during login (T419723)]], [[gerrit:1251150{{!}}login: Add 'alwaysShowLogin' login URL parameter (T419723)]], [[gerrit:1251168{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1250750{{!}}Enable parser survey for opted-out users on ru/pt/ja/id wikis (T414852)]] (duration: 10m 41s)
* 20:58 tgr@deploy2002: tgr, jsn, cscott: Continuing with sync
* 20:58 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage
* 20:54 tgr@deploy2002: tgr, jsn, cscott: Backport for [[gerrit:1251152{{!}}Use 'alwaysShowLogin' query parameter during login (T419723)]], [[gerrit:1251150{{!}}login: Add 'alwaysShowLogin' login URL parameter (T419723)]], [[gerrit:1251168{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1250750{{!}}Enable parser survey for opted-out users on ru/pt/ja/id wikis (T414852)]] synced to the testservers (see https://wikitech
* 20:52 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1251152{{!}}Use 'alwaysShowLogin' query parameter during login (T419723)]], [[gerrit:1251150{{!}}login: Add 'alwaysShowLogin' login URL parameter (T419723)]], [[gerrit:1251168{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1250750{{!}}Enable parser survey for opted-out users on ru/pt/ja/id wikis (T414852)]]
* 20:49 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 20:43 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251087{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251088{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251106{{!}}phpunit: Avoid unnecessary writes in generatePHPUnitConfig.php (T419107)]] (duration: 07m 37s)
* 20:39 tgr@deploy2002: tgr, daimona: Continuing with sync
* 20:37 tgr@deploy2002: tgr, daimona: Backport for [[gerrit:1251087{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251088{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251106{{!}}phpunit: Avoid unnecessary writes in generatePHPUnitConfig.php (T419107)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:37 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4046.ulsfo.wmnet with OS trixie
* 20:35 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1251087{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251088{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251106{{!}}phpunit: Avoid unnecessary writes in generatePHPUnitConfig.php (T419107)]]
* 20:35 jsn@deploy2002: Synchronized wmf-config/throttle.php: (no justification provided) (duration: 01m 57s)
* 20:32 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4045.*
* 20:28 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4041.ulsfo.wmnet with OS trixie
* 20:20 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 20:18 jsn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249364{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1251140{{!}}[arwikiquote] add namespace alias for NS_PROJECT (T419828)]], [[gerrit:1251098{{!}}Deploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1251164{{!}}Increase IP cap limit for azwiki (T419899)]] (duration: 11m 11s)
* 20:14 jsn@deploy2002: jsn, dani, nmw03, gergesshamon: Continuing with sync
* 20:09 jsn@deploy2002: jsn, dani, nmw03, gergesshamon: Backport for [[gerrit:1249364{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1251140{{!}}[arwikiquote] add namespace alias for NS_PROJECT (T419828)]], [[gerrit:1251098{{!}}Deploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1251164{{!}}Increase IP cap limit for azwiki (T419899)]] synced to the testservers (see https://wikitech.wikimedia.org/wik
* 20:07 jsn@deploy2002: Started scap sync-world: Backport for [[gerrit:1249364{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1251140{{!}}[arwikiquote] add namespace alias for NS_PROJECT (T419828)]], [[gerrit:1251098{{!}}Deploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1251164{{!}}Increase IP cap limit for azwiki (T419899)]]
* 19:21 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mathoid: apply
* 19:21 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mathoid: apply
* 19:20 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mathoid: apply
* 19:19 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mathoid: apply
* 19:16 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 19:16 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 19:15 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 19:14 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 19:13 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 19:12 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 19:12 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 19:11 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 19:07 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4041.ulsfo.wmnet with OS trixie
* 19:06 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4041.ulsfo.wmnet [reason: trixie reimaging]
* 19:06 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4039.ulsfo.wmnet [reason: trixie reimaging]
* 19:06 swfrench-wmf: reprepro include xdebug_3.4.4-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include wikidiff2_1.14.1-2+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include php-yaml_2.2.4-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include php-xhprof_2.3.10-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include php-wmerrors_2.0.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include php-uuid_1.3.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 brennen@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251138{{!}}EditPage: Re-add catch block for MWException (T419883)]] (duration: 09m 46s)
* 19:04 swfrench-wmf: reprepro include php-redis_6.2.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:04 swfrench-wmf: reprepro include php-pcov_1.0.12-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:04 swfrench-wmf: reprepro include php-memcached_3.3.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:04 swfrench-wmf: reprepro include php-luasandbox_4.1.2-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:03 swfrench-wmf: reprepro include php-imagick_3.7.0-13+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:03 swfrench-wmf: reprepro include php-excimer_1.2.5-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:01 brennen@deploy2002: somerandomdeveloper, brennen: Continuing with sync
* 18:59 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 18:57 brennen@deploy2002: somerandomdeveloper, brennen: Backport for [[gerrit:1251138{{!}}EditPage: Re-add catch block for MWException (T419883)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:55 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4039.ulsfo.wmnet with OS trixie
* 18:55 brennen@deploy2002: Started scap sync-world: Backport for [[gerrit:1251138{{!}}EditPage: Re-add catch block for MWException (T419883)]]
* 18:52 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mathoid: apply
* 18:52 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mathoid: apply
* 18:42 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp20(2[789]{{!}}3[0-9]{{!}}40).*,service=ats-be
* 18:34 brennen@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.19 refs [[phab:T413810|T413810]]
* 18:29 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4039.ulsfo.wmnet with reason: host reimage
* 18:29 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:28 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Updating dse-k8s-worker1019 - btullis@cumin1003"
* 18:26 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2332.codfw.wmnet
* 18:26 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2332.codfw.wmnet
* 18:25 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Updating dse-k8s-worker1019 - btullis@cumin1003"
* 18:24 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4039.ulsfo.wmnet with reason: host reimage
* 18:23 brennen@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251139{{!}}Ensure that we always run ParserHooks::transformHtml() when using Parsoid (T419830)]] (duration: 14m 46s)
* 18:21 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 18:20 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4038.ulsfo.wmnet with OS trixie
* 18:19 brennen@deploy2002: cscott, brennen: Continuing with sync
* 18:18 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 18:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4045.ulsfo.wmnet with OS trixie
* 18:10 brennen@deploy2002: cscott, brennen: Backport for [[gerrit:1251139{{!}}Ensure that we always run ParserHooks::transformHtml() when using Parsoid (T419830)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:10 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1019.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:08 brennen@deploy2002: Started scap sync-world: Backport for [[gerrit:1251139{{!}}Ensure that we always run ParserHooks::transformHtml() when using Parsoid (T419830)]]
* 18:02 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1019.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:02 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 17:59 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4039.ulsfo.wmnet with OS trixie
* 17:58 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1019
* 17:58 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1019
* 17:56 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4039.ulsfo.wmnet [reason: trixie reimaging]
* 17:55 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp20(3[6-9]{{!}}4[012]).*
* 17:54 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 17:53 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet [reason: trixie reimaging]
* 17:52 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4045.ulsfo.wmnet with reason: host reimage
* 17:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4037.ulsfo.wmnet with OS trixie
* 17:49 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4045.ulsfo.wmnet with reason: host reimage
* 17:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1018.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:33 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1018.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:31 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1018
* 17:31 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1018
* 17:30 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 17:28 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS trixie
* 17:28 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4045.ulsfo.wmnet with OS trixie
* 17:27 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp203[0-5].*
* 17:24 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4037.ulsfo.wmnet with reason: host reimage
* 17:20 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 17:20 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 17:18 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4037.ulsfo.wmnet with reason: host reimage
* 17:17 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-backup1004.eqiad.wmnet with OS trixie
* 17:17 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 17:16 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 17:06 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp202[89].*
* 17:03 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp2027.*
* 16:59 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 16:58 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4038.ulsfo.wmnet [reason: trixie reimaging]
* 16:58 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-backup1004.eqiad.wmnet with reason: host reimage
* 16:58 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:57 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4037.ulsfo.wmnet with OS trixie
* 16:57 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet [reason: trixie reimaging]
* 16:55 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-backup1004.eqiad.wmnet with reason: host reimage
* 16:50 swfrench-wmf: reprepro include php-apcu_5.1.24-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:45 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 16:45 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 16:44 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 16:43 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 16:43 swfrench-wmf: reprepro include dh-php_5.5+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:42 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 16:41 swfrench-wmf: reprepro include php-defaults_94+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:37 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ms-backup1004.eqiad.wmnet with OS trixie
* 16:36 swfrench-wmf: reprepro include php8.3_8.3.30-1+wmf11u2+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:27 dzahn@dns1004: END - running authdns-update
* 16:26 dzahn@dns1004: START - running authdns-update
* 16:25 mutante: switching old status.wikimedia.org page away from rackspace [[phab:T414098|T414098]]
* 16:25 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS trixie
* 16:20 dzahn@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 16:20 dzahn@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 16:19 dzahn@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 16:19 dzahn@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 16:12 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
* 16:11 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
* 16:10 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 16:09 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 16:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 16:09 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 16:08 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 16:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 16:07 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 16:06 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 16:05 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
* 16:04 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
* 16:04 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 16:04 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 16:03 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
* 16:02 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
* 16:02 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 16:02 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 16:01 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 15:58 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 15:57 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 15:57 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 15:56 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 15:52 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudgw2002-dev.codfw.wmnet
* 15:52 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:52 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw2002-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 15:47 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw2002-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 15:43 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 15:41 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:40 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:36 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudgw2002-dev.codfw.wmnet
* 15:35 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 15:33 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 15:27 ebernhardson@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 15:26 ebernhardson@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 15:19 moritzm: reuploaded libxml2 2.9.10+dfsg-6.7+deb11u9+wmf11u1 and 72.1-3+deb12u1~wmf11u1 to component/php83-icu72 for bullseye-wikimedia [[phab:T419058|T419058]]
* 15:14 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:13 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:13 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=tcp-proxy4004.ulsfo.wmnet
* 15:13 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=tcp-proxy4004.ulsfo.wmnet
* 15:12 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=tcp-proxy4003.ulsfo.wmnet
* 15:12 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=tcp-proxy4003.ulsfo.wmnet
* 15:00 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 15:00 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:57 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:56 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:45 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:44 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1018.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:34 andrew@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 14:33 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1018.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:31 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 14:31 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1018
* 14:31 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1018
* 14:25 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:24 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:20 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1020.eqiad.wmnet
* 14:15 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 24 hosts with reason: Switch BGP bounce
* 14:12 ayounsi@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet
* 14:09 mlitn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251034{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251035{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251036{{!}}Remove queueing logic (T419587)]], [[gerrit:1251037{{!}}Remove queueing logic (T419587)]] (duration: 07m 15s)
* 14:08 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 14:07 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:07 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:05 mlitn@deploy2002: mlitn: Continuing with sync
* 14:04 mlitn@deploy2002: mlitn: Backport for [[gerrit:1251034{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251035{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251036{{!}}Remove queueing logic (T419587)]], [[gerrit:1251037{{!}}Remove queueing logic (T419587)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:03 XioNoX: start eqiad rack D2 depools
* 14:02 mlitn@deploy2002: Started scap sync-world: Backport for [[gerrit:1251034{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251035{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251036{{!}}Remove queueing logic (T419587)]], [[gerrit:1251037{{!}}Remove queueing logic (T419587)]]
* 13:59 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:59 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:57 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:54 moritzm: installing libssh security updates
* 13:54 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:45 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251031{{!}}ext.testKitchen: Depend on mediawiki.user module]], [[gerrit:1251048{{!}}Add title to the request context in FlaggedRevsCacheTest (T419539)]], [[gerrit:1251032{{!}}ext.testKitchen: Depend on mediawiki.user module]] (duration: 08m 01s)
* 13:42 phuedx@deploy2002: phuedx: Continuing with sync
* 13:39 phuedx@deploy2002: phuedx: Backport for [[gerrit:1251031{{!}}ext.testKitchen: Depend on mediawiki.user module]], [[gerrit:1251048{{!}}Add title to the request context in FlaggedRevsCacheTest (T419539)]], [[gerrit:1251032{{!}}ext.testKitchen: Depend on mediawiki.user module]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:37 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1251031{{!}}ext.testKitchen: Depend on mediawiki.user module]], [[gerrit:1251048{{!}}Add title to the request context in FlaggedRevsCacheTest (T419539)]], [[gerrit:1251032{{!}}ext.testKitchen: Depend on mediawiki.user module]]
* 13:26 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251005{{!}}Deploy EditCheck suggestion mode at all Wikipedias (T415320)]] (duration: 06m 42s)
* 13:24 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 esanders@deploy2002: esanders: Continuing with sync
* 13:22 esanders@deploy2002: esanders: Backport for [[gerrit:1251005{{!}}Deploy EditCheck suggestion mode at all Wikipedias (T415320)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1178.eqiad.wmnet
* 13:21 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 13:20 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1251005{{!}}Deploy EditCheck suggestion mode at all Wikipedias (T415320)]]
* 13:18 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250656{{!}}Add multilingual revert risk host header for LiftWing requests (T419718)]] (duration: 10m 52s)
* 13:14 fnegri@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.add-wiki (exit_code=99) for database kaiwiki ([[phab:T414240|T414240]])
* 13:14 fnegri@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database kaiwiki ([[phab:T414240|T414240]])
* 13:14 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 13:14 kgraessle@deploy2002: kgraessle: Continuing with sync
* 13:12 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 13:11 kgraessle@deploy2002: kgraessle: Backport for [[gerrit:1250656{{!}}Add multilingual revert risk host header for LiftWing requests (T419718)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 13:07 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1250656{{!}}Add multilingual revert risk host header for LiftWing requests (T419718)]]
* 13:05 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1159.eqiad.wmnet
* 13:03 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 13:02 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 13:02 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:02 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:57 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1159.eqiad.wmnet
* 12:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1013.eqiad.wmnet
* 12:49 dpogorzelski@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop: sync
* 12:49 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1013.eqiad.wmnet
* 12:49 dpogorzelski@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: sync
* 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:31 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host durum4004.ulsfo.wmnet
* 12:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4004.ulsfo.wmnet
* 12:28 moritzm: installing postgresql-17 security updates
* 12:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4004.ulsfo.wmnet
* 12:14 moritzm: installing wireshark security updates
* 12:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1013.eqiad.wmnet with reason: host reimage
* 12:07 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1013.eqiad.wmnet with reason: host reimage
* 11:52 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:51 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:50 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host tcp-proxy4004.ulsfo.wmnet
* 11:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy4004.ulsfo.wmnet with OS trixie
* 11:49 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy4004.ulsfo.wmnet with reason: host reimage
* 11:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy4004.ulsfo.wmnet with reason: host reimage
* 11:19 jayme: disabled puppet on all wikikube worker nodes to rollout/test new apparmor profiles in staging - [[phab:T419781|T419781]]
* 11:07 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host tcp-proxy4004.ulsfo.wmnet with OS trixie
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy4004.ulsfo.wmnet - jmm@cumin2002"
* 11:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy4004.ulsfo.wmnet - jmm@cumin2002"
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) tcp-proxy4004.ulsfo.wmnet on all recursors
* 11:06 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache tcp-proxy4004.ulsfo.wmnet on all recursors
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy4004.ulsfo.wmnet - jmm@cumin2002"
* 11:03 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:00 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy4004.ulsfo.wmnet - jmm@cumin2002"
* 10:42 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device asw1-23-ulsfo
* 10:41 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device asw1-23-ulsfo
* 10:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1013.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 10:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4003.ulsfo.wmnet
* 10:31 vgutierrez@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir4001.ulsfo.wmnet
* 10:31 vgutierrez@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir4002.ulsfo.wmnet
* 10:31 vgutierrez@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir4004.ulsfo.wmnet
* 10:30 vgutierrez@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir4003.ulsfo.wmnet
* 10:30 vgutierrez: repooling ncredir4003 & ncredir4004
* 10:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4003.ulsfo.wmnet
* 10:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host tcp-proxy4004.ulsfo.wmnet
* 10:26 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1013.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 10:26 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:25 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1013
* 10:22 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1013
* 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host tcp-proxy4003.ulsfo.wmnet
* 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy4003.ulsfo.wmnet with OS trixie
* 10:12 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1011.eqiad.wmnet
* 10:12 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:11 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:11 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:10 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:09 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1011.eqiad.wmnet
* 10:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1010.eqiad.wmnet
* 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy4003.ulsfo.wmnet with reason: host reimage
* 10:00 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1010.eqiad.wmnet
* 09:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy4003.ulsfo.wmnet with reason: host reimage
* 09:48 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/SERVICE_NAME: apply
* 09:48 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/SERVICE_NAME: apply
* 09:40 mvernon@cumin2002: conftool action : set/pooled=yes; selector: name=ms-fe2024.codfw.wmnet
* 09:40 mvernon@cumin2002: conftool action : set/pooled=yes; selector: name=ms-fe2023.codfw.wmnet
* 09:40 mvernon@cumin2002: conftool action : set/pooled=yes; selector: name=ms-fe2022.codfw.wmnet
* 09:40 mvernon@cumin2002: conftool action : set/pooled=yes; selector: name=ms-fe2021.codfw.wmnet
* 09:39 mvernon@cumin2002: conftool action : set/weight=40; selector: name=ms-fe2024.codfw.wmnet
* 09:39 mvernon@cumin2002: conftool action : set/weight=40; selector: name=ms-fe2023.codfw.wmnet
* 09:39 mvernon@cumin2002: conftool action : set/weight=40; selector: name=ms-fe2022.codfw.wmnet
* 09:39 mvernon@cumin2002: conftool action : set/weight=40; selector: name=ms-fe2021.codfw.wmnet
* 09:39 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Post reimage - btullis@cumin1003"
* 09:39 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Post reimage - btullis@cumin1003"
* 09:39 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 09:39 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 09:38 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 09:38 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 09:35 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on P<nowiki>{</nowiki>ms-fe[2009-2020].codfw.wmnet<nowiki>}</nowiki> and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 09:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host durum4004.ulsfo.wmnet
* 09:32 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 09:32 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host tcp-proxy4003.ulsfo.wmnet with OS trixie
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy4003.ulsfo.wmnet - jmm@cumin2002"
* 09:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy4003.ulsfo.wmnet - jmm@cumin2002"
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) tcp-proxy4003.ulsfo.wmnet on all recursors
* 09:30 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache tcp-proxy4003.ulsfo.wmnet on all recursors
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy4003.ulsfo.wmnet - jmm@cumin2002"
* 09:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy4003.ulsfo.wmnet - jmm@cumin2002"
* 09:28 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on P<nowiki>{</nowiki>ms-fe[2009-2020].codfw.wmnet<nowiki>}</nowiki> and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 09:28 Emperor: roll-restart codfw ms frontends prior to pooling new ones [[phab:T416243|T416243]]
* 09:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host durum4003.ulsfo.wmnet
* 09:23 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:23 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host tcp-proxy4003.ulsfo.wmnet
* 09:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host durum4003.ulsfo.wmnet
* 09:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts netflow4002.ulsfo.wmnet
* 09:01 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:01 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: netflow4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:56 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: netflow4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:51 slyngshede@dns1004: END - running authdns-update
* 08:50 slyngshede@dns1004: START - running authdns-update
* 08:37 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:32 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts netflow4002.ulsfo.wmnet
* 08:25 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 08:23 arnaudb@dns1004: END - running authdns-update
* 08:21 arnaudb@dns1004: START - running authdns-update
* 07:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir4004.ulsfo.wmnet
* 07:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncredir4004.ulsfo.wmnet
* 07:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir4003.ulsfo.wmnet
* 07:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncredir4003.ulsfo.wmnet
* 05:24 kart_: staging: machinetranslation: Optimize model loading and memory footprints ([[phab:T411058|T411058]])
* 05:19 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
* 05:16 kartik@deploy2002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
* 02:16 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2005.codfw.wmnet with OS trixie
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 14s)
* 02:03 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:59 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2005.codfw.wmnet with reason: host reimage
* 01:52 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2005.codfw.wmnet with reason: host reimage
* 01:49 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:47 swfrench-wmf: reprepro include php-apcu_5.1.24-1+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:37 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2005.codfw.wmnet with OS trixie
* 01:36 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2004.codfw.wmnet with OS trixie
* 01:24 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7012.*
* 01:20 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 01:18 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2004.codfw.wmnet with reason: host reimage
* 01:18 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 01:15 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2004.codfw.wmnet with reason: host reimage
* 01:13 swfrench-wmf: reprepro include dh-php_5.5+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:08 swfrench-wmf: reprepro include php-defaults_94+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:05 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 01:05 rzl@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 01:03 swfrench-wmf: reprepro include php8.3_8.3.30-1+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:00 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2004.codfw.wmnet with OS trixie
* 00:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7012.magru.wmnet with OS trixie
* 00:59 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - brett@cumin2002"
* 00:58 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - brett@cumin2002"
* 00:38 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 00:38 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 00:37 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 00:37 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 00:36 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 00:36 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 00:33 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 00:29 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7012.magru.wmnet with reason: host reimage
* 00:27 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 00:24 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7012.magru.wmnet with reason: host reimage
* 00:03 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7012.magru.wmnet with OS trixie
== 2026-03-11 ==
* 23:56 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7009.*
* 22:52 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:52 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:45 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 22:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 22:29 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 22:29 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 22:27 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7009.magru.wmnet with OS trixie
* 21:56 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 21:55 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 21:54 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250051{{!}}OrchestratorRequest: Switch evaluations to v2 endpoint (T413727)]] (duration: 18m 19s)
* 21:47 jforrester@deploy2002: jforrester: Continuing with sync
* 21:43 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:43 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7009.magru.wmnet with reason: host reimage
* 21:42 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:40 jforrester@deploy2002: jforrester: Backport for [[gerrit:1250051{{!}}OrchestratorRequest: Switch evaluations to v2 endpoint (T413727)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:39 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7009.magru.wmnet with reason: host reimage
* 21:35 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1250051{{!}}OrchestratorRequest: Switch evaluations to v2 endpoint (T413727)]]
* 21:30 rzl: rzl@apt1002:~$ sudo -i reprepro -C component/envoy-future include bullseye-wikimedia /home/rzl/envoyproxy_1.35.9-1_amd64.changes
* 21:29 arlolra@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250665{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]], [[gerrit:1250666{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]] (duration: 35m 16s)
* 21:16 arlolra@deploy2002: arlolra: Continuing with sync
* 21:15 arlolra@deploy2002: arlolra: Backport for [[gerrit:1250665{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]], [[gerrit:1250666{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:08 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7009.magru.wmnet with OS trixie
* 21:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7010.*
* 21:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7010.magru.wmnet with OS trixie
* 20:54 arlolra@deploy2002: Started scap sync-world: Backport for [[gerrit:1250665{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]], [[gerrit:1250666{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]]
* 20:47 jsn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250579{{!}}urwikisource: add logo, sitename and projectnamespace (T415974)]] (duration: 06m 55s)
* 20:43 jsn@deploy2002: anzx, jsn: Continuing with sync
* 20:42 jsn@deploy2002: anzx, jsn: Backport for [[gerrit:1250579{{!}}urwikisource: add logo, sitename and projectnamespace (T415974)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:40 jsn@deploy2002: Started scap sync-world: Backport for [[gerrit:1250579{{!}}urwikisource: add logo, sitename and projectnamespace (T415974)]]
* 20:38 jsn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250581{{!}}riskyArticleEdits: show page descriptions (T419442)]], [[gerrit:1250582{{!}}Fix Instrumentation on mobile view (T419517)]], [[gerrit:1250632{{!}}ext.wikimediaEvents: Updated Test Kitchen impact test experiment (T407570)]] (duration: 10m 37s)
* 20:38 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on ml-serve1014.eqiad.wmnet with reason: [[phab:T400626|T400626]]
* 20:37 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7010.magru.wmnet with reason: host reimage
* 20:34 jsn@deploy2002: jsn, sfaci: Continuing with sync
* 20:34 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search-test: apply
* 20:33 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search-test: apply
* 20:32 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7010.magru.wmnet with reason: host reimage
* 20:30 jsn@deploy2002: jsn, sfaci: Backport for [[gerrit:1250581{{!}}riskyArticleEdits: show page descriptions (T419442)]], [[gerrit:1250582{{!}}Fix Instrumentation on mobile view (T419517)]], [[gerrit:1250632{{!}}ext.wikimediaEvents: Updated Test Kitchen impact test experiment (T407570)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:28 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on gitlab1003.wikimedia.org with reason: Upgrade
* 20:28 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on gitlab2002.wikimedia.org with reason: Upgrade
* 20:27 jsn@deploy2002: Started scap sync-world: Backport for [[gerrit:1250581{{!}}riskyArticleEdits: show page descriptions (T419442)]], [[gerrit:1250582{{!}}Fix Instrumentation on mobile view (T419517)]], [[gerrit:1250632{{!}}ext.wikimediaEvents: Updated Test Kitchen impact test experiment (T407570)]]
* 20:21 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:18 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 20:17 bvibber@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250647{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]], [[gerrit:1250648{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]] (duration: 06m 47s)
* 20:13 bvibber@deploy2002: bvibber: Continuing with sync
* 20:12 bvibber@deploy2002: bvibber: Backport for [[gerrit:1250647{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]], [[gerrit:1250648{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:10 bvibber@deploy2002: Started scap sync-world: Backport for [[gerrit:1250647{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]], [[gerrit:1250648{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]]
* 19:59 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7010.magru.wmnet with OS trixie
* 19:54 andrew@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 19:51 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 19:37 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-backup1004.eqiad.wmnet with OS trixie
* 19:01 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts cp7011.magru.wmnet
* 19:01 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7011.magru.wmnet
* 18:56 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough
* 18:49 brennen@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.19 refs [[phab:T413810|T413810]]
* 18:49 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:49 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:45 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:45 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:44 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:44 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:43 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling restart_daemons on A:wikidough
* 18:42 brennen: 1.46.0-wmf.19 train status: no current blockers, going ahead to group1.
* 18:39 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2332.codfw.wmnet
* 18:37 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2332.codfw.wmnet
* 18:20 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7011.*
* 18:18 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 18:16 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ms-backup1004.eqiad.wmnet with OS trixie
* 18:13 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 17:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 17:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1010.eqiad.wmnet with reason: host reimage
* 17:52 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:52 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:48 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 17:47 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1010.eqiad.wmnet with reason: host reimage
* 17:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 17:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 17:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:38 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:38 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:37 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:37 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:36 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:36 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:36 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:35 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 17:34 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 17:32 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
* 17:31 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
* 17:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:28 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:28 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:27 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:20 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:19 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 17:19 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:18 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:15 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 17:13 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:12 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:09 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:09 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:02 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7011.magru.wmnet with OS trixie
* 17:01 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 17:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 16:58 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on durum4004.ulsfo.wmnet with reason: in setup
* 16:58 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on durum4003.ulsfo.wmnet with reason: in setup
* 16:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 16:40 root@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:40 root@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moving many things from cloudgw2002-dev to cloudgw2004-dev - root@cumin2002"
* 16:40 root@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moving many things from cloudgw2002-dev to cloudgw2004-dev - root@cumin2002"
* 16:39 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 16:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 16:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7011.magru.wmnet with reason: host reimage
* 16:35 root@cumin2002: START - Cookbook sre.dns.netbox
* 16:32 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus4002.ulsfo.wmnet
* 16:32 tappof@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:32 tappof@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - tappof@cumin1003"
* 16:30 tappof@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - tappof@cumin1003"
* 16:30 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7011.magru.wmnet with reason: host reimage
* 16:25 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 16:23 tappof@cumin1003: START - Cookbook sre.dns.netbox
* 16:18 tappof@cumin1003: START - Cookbook sre.hosts.decommission for hosts prometheus4002.ulsfo.wmnet
* 15:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:52 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7011.magru.wmnet with OS trixie
* 15:51 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 15:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 15:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 15:50 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:49 urbanecm@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:48 sukhe: sudo cumin -b1 -s10 "C:dnsrecursor" "run-puppet-agent --enable 'merging CR 1250576'"
* 15:48 urbanecm@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:46 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:45 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 15:43 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 15:39 sukhe: sudo cumin "C:dnsrecursor" "disable-puppet 'merging CR 1250576'"
* 15:35 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 15:26 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 15:08 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 15:08 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 15:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:53 swfrench-wmf: updated component/php83-icu72 with libpcre2 10.42-1~wmf11+1 from apt-staging - [[phab:T419058|T419058]]
* 14:46 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 14:45 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 14:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum4004.ulsfo.wmnet
* 14:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum4004.ulsfo.wmnet with OS trixie
* 14:39 vgutierrez: depool ncredir4003 && ncredir4004
* 14:38 vgutierrez: repool ncredir4001 && ncredir4002
* 14:31 jmm@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir4002.ulsfo.wmnet
* 14:31 jmm@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir4001.ulsfo.wmnet
* 14:30 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir4004.ulsfo.wmnet
* 14:30 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=ncredir4004.ulsfo.wmnet
* 14:27 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir4003.ulsfo.wmnet
* 14:27 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=ncredir4003.ulsfo.wmnet
* 14:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum4004.ulsfo.wmnet with reason: host reimage
* 14:23 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:23 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:19 moritzm: installing python-urllib3 security updates
* 14:18 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum4004.ulsfo.wmnet with reason: host reimage
* 14:14 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 14:13 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:12 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:12 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:12 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:12 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:11 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:11 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:11 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:08 gkyziridis@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:08 gkyziridis@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:07 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250568{{!}}Fix pinnableElement export (T419620)]] (duration: 06m 26s)
* 14:06 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:04 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:04 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:03 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:03 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 14:03 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:02 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1250568{{!}}Fix pinnableElement export (T419620)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:00 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1250568{{!}}Fix pinnableElement export (T419620)]]
* 13:58 moritzm: uploaded libxml2 2.9.10+dfsg-6.7+deb11u9+wmf11u1 to component/php83-icu72 for bullseye-wikimedia (special build of libxml with ICU disabled to ensure co-installability between icu 67 and icu 72) [[phab:T419058|T419058]]
* 13:57 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250566{{!}}Restore advanced main menu for AMC (T413912)]] (duration: 10m 44s)
* 13:55 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum4004.ulsfo.wmnet with OS trixie
* 13:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:54 vgutierrez: repool cp7016
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4004.ulsfo.wmnet - jmm@cumin2002"
* 13:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4004.ulsfo.wmnet - jmm@cumin2002"
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum4004.ulsfo.wmnet on all recursors
* 13:54 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum4004.ulsfo.wmnet on all recursors
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4004.ulsfo.wmnet - jmm@cumin2002"
* 13:51 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 13:50 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1250566{{!}}Restore advanced main menu for AMC (T413912)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:49 vgutierrez: depool cp7016
* 13:49 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4004.ulsfo.wmnet - jmm@cumin2002"
* 13:46 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1250566{{!}}Restore advanced main menu for AMC (T413912)]]
* 13:45 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:44 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:44 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247547{{!}}Remove `MetricsPlatform` configuration from production (T416865)]] (duration: 35m 52s)
* 13:43 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 13:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 13:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4004.ulsfo.wmnet with OS bookworm
* 13:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:36 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum4004.ulsfo.wmnet
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum4003.ulsfo.wmnet
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum4003.ulsfo.wmnet with OS trixie
* 13:36 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 13:35 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 13:30 jdlrobson@deploy2002: jdlrobson, sfaci: Continuing with sync
* 13:29 jdlrobson@deploy2002: jdlrobson, sfaci: Backport for [[gerrit:1247547{{!}}Remove `MetricsPlatform` configuration from production (T416865)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
* 13:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum4003.ulsfo.wmnet with reason: host reimage
* 13:18 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
* 13:13 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum4003.ulsfo.wmnet with reason: host reimage
* 13:08 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1247547{{!}}Remove `MetricsPlatform` configuration from production (T416865)]]
* 13:00 moritzm: installing libcommons-lang3-java security updates
* 12:57 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4004.ulsfo.wmnet with OS bookworm
* 12:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4003.ulsfo.wmnet with OS bookworm
* 12:46 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum4003.ulsfo.wmnet with OS trixie
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4003.ulsfo.wmnet - jmm@cumin2002"
* 12:46 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4003.ulsfo.wmnet - jmm@cumin2002"
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum4003.ulsfo.wmnet on all recursors
* 12:45 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum4003.ulsfo.wmnet on all recursors
* 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4003.ulsfo.wmnet - jmm@cumin2002"
* 12:41 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4003.ulsfo.wmnet - jmm@cumin2002"
* 12:37 moritzm: installing inetutils security updates
* 12:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:36 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum4003.ulsfo.wmnet
* 12:35 tappof: completed migration from prometheus4002 to prometheus4003 (ulsfo) (T419430)
* 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
* 12:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
* 12:28 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 12:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2073.codfw.wmnet with OS bullseye
* 12:23 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 12:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow4003.ulsfo.wmnet
* 12:18 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 12:17 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1011
* 12:17 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1011
* 12:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2072.codfw.wmnet with OS bullseye
* 12:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow4003.ulsfo.wmnet
* 12:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2073.codfw.wmnet with reason: host reimage
* 12:04 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4003.ulsfo.wmnet with OS bookworm
* 12:01 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1010
* 11:59 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1010
* 11:58 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2073.codfw.wmnet with reason: host reimage
* 11:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2072.codfw.wmnet with reason: host reimage
* 11:48 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2072.codfw.wmnet with reason: host reimage
* 11:41 urbanecm@deploy2002: Finished scap sync-world: Backport for [[gerrit:1239954{{!}}[Growth] Enable on every new Wikipedia by default (T304052)]] (duration: 06m 39s)
* 11:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2073
* 11:38 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2073
* 11:37 vgutierrez: upgrading to acme-chief 0.39 on acme-chief production instances - [[phab:T419352|T419352]]
* 11:37 urbanecm@deploy2002: urbanecm: Continuing with sync
* 11:36 urbanecm@deploy2002: urbanecm: Backport for [[gerrit:1239954{{!}}[Growth] Enable on every new Wikipedia by default (T304052)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:36 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2073
* 11:36 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2073.codfw.wmnet 212.48.192.10.in-addr.arpa 2.1.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:36 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2073.codfw.wmnet 212.48.192.10.in-addr.arpa 2.1.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:36 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:36 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2073 - mvernon@cumin2002"
* 11:36 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2073 - mvernon@cumin2002"
* 11:35 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 11:34 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 11:34 urbanecm@deploy2002: Started scap sync-world: Backport for [[gerrit:1239954{{!}}[Growth] Enable on every new Wikipedia by default (T304052)]]
* 11:34 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 11:34 urbanecm@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250539{{!}}[Growth] kaiwiki: Enable GrowthExperiments (T304052)]] (duration: 14m 11s)
* 11:33 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 11:33 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 11:32 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 11:32 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:31 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2073
* 11:30 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2073.codfw.wmnet with OS bullseye
* 11:30 urbanecm@deploy2002: urbanecm: Continuing with sync
* 11:29 cgoubert@dns1004: END - running authdns-update
* 11:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2072
* 11:29 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2072
* 11:28 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2072
* 11:28 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2072.codfw.wmnet 158.32.192.10.in-addr.arpa 8.5.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:28 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2072.codfw.wmnet 158.32.192.10.in-addr.arpa 8.5.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:28 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:28 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2072 - mvernon@cumin2002"
* 11:28 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2072 - mvernon@cumin2002"
* 11:28 cgoubert@dns1004: START - running authdns-update
* 11:26 urbanecm@deploy2002: mwscript-k8s job started: WikimediaMaintenance:createExtensionTables.php --wiki=kaiwiki growthexperiments # [[phab:T304052|T304052]]
* 11:24 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:24 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2072
* 11:23 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2072.codfw.wmnet with OS bullseye
* 11:22 tappof@dns1004: END - running authdns-update
* 11:22 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
* 11:21 urbanecm@deploy2002: urbanecm: Backport for [[gerrit:1250539{{!}}[Growth] kaiwiki: Enable GrowthExperiments (T304052)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:21 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
* 11:21 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
* 11:21 tappof@dns1004: START - running authdns-update
* 11:21 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
* 11:19 urbanecm@deploy2002: Started scap sync-world: Backport for [[gerrit:1250539{{!}}[Growth] kaiwiki: Enable GrowthExperiments (T304052)]]
* 11:19 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 11:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2071.codfw.wmnet with OS bullseye
* 11:18 urbanecm@deploy2002: mwscript-k8s job started: WikimediaMaintenance:createExtensionTables.php --wiki=kaiwiki growthexperiments # [[phab:T304052|T304052]]
* 11:10 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
* 11:10 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
* 11:08 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
* 11:08 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
* 11:05 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
* 11:05 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
* 10:58 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2071.codfw.wmnet with reason: host reimage
* 10:54 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2071.codfw.wmnet with reason: host reimage
* 10:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2071
* 10:35 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2071
* 10:34 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2071
* 10:34 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2071.codfw.wmnet 221.16.192.10.in-addr.arpa 1.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:34 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2071.codfw.wmnet 221.16.192.10.in-addr.arpa 1.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:34 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:34 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2071 - mvernon@cumin2002"
* 10:34 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2071 - mvernon@cumin2002"
* 10:26 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 10:24 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2071
* 10:23 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2071.codfw.wmnet with OS bullseye
* 10:08 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2095.codfw.wmnet with OS bullseye
* 10:03 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Failed step after ml-serve1015's reimage - elukey@cumin1003"
* 10:02 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Failed step after ml-serve1015's reimage - elukey@cumin1003"
* 10:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1015.eqiad.wmnet with OS trixie
* 10:01 elukey@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 09:59 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2096.codfw.wmnet with OS bullseye
* 09:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2096.codfw.wmnet with OS bullseye
* 09:52 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
* 09:51 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
* 09:51 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2095.codfw.wmnet with OS bullseye
* 09:51 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
* 09:46 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
* 09:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
* 09:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
* 09:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 09:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 09:34 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 09:33 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
* 09:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
* 09:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 09:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
* 09:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 09:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
* 09:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
* 09:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 09:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 09:28 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
* 09:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 09:27 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
* 09:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 09:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 09:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 09:24 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
* 09:22 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ncredir4004.ulsfo.wmnet
* 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4004.ulsfo.wmnet with OS bookworm
* 09:15 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:15 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:14 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 09:10 javiermonton@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249217{{!}}stream: mediawiki.page_html_content_change (T419258)]] (duration: 08m 28s)
* 09:07 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2096.codfw.wmnet with OS bullseye
* 09:06 javiermonton@deploy2002: javiermonton: Continuing with sync
* 09:03 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2095.codfw.wmnet with OS bullseye
* 09:03 javiermonton@deploy2002: javiermonton: Backport for [[gerrit:1249217{{!}}stream: mediawiki.page_html_content_change (T419258)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
* 09:01 javiermonton@deploy2002: Started scap sync-world: Backport for [[gerrit:1249217{{!}}stream: mediawiki.page_html_content_change (T419258)]]
* 08:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1015.eqiad.wmnet with reason: host reimage
* 08:58 trueg@deploy2002: helmfile [staging] DONE helmfile.d/services/SERVICE_NAME: apply
* 08:58 trueg@deploy2002: helmfile [staging] START helmfile.d/services/SERVICE_NAME: apply
* 08:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
* 08:55 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2239.codfw.wmnet with reason: mysql upgrade / restart
* 08:54 moritzm: installing imagemagick security updates
* 08:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1015.eqiad.wmnet with reason: host reimage
* 08:41 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1015.eqiad.wmnet with OS trixie
* 08:40 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1014.eqiad.wmnet with OS trixie
* 08:40 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:39 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:35 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4004.ulsfo.wmnet with OS bookworm
* 08:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
* 08:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir4004.ulsfo.wmnet on all recursors
* 08:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir4004.ulsfo.wmnet on all recursors
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
* 08:25 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
* 08:23 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1014.eqiad.wmnet with reason: host reimage
* 08:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 08:21 Msz2001: UTC morning backport window finished
* 08:21 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 08:21 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir4004.ulsfo.wmnet
* 08:21 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250426{{!}}Drop underscore from titles in wgOATH2FARequiredGroupRemovalPages]] (duration: 10m 46s)
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ncredir4003.ulsfo.wmnet
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4003.ulsfo.wmnet with OS bookworm
* 08:17 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1014.eqiad.wmnet with reason: host reimage
* 08:15 mszwarc@deploy2002: mszwarc: Continuing with sync
* 08:14 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1250426{{!}}Drop underscore from titles in wgOATH2FARequiredGroupRemovalPages]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:10 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1250426{{!}}Drop underscore from titles in wgOATH2FARequiredGroupRemovalPages]]
* 08:09 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249921{{!}}Display list of 2FA-req. groups on AccountSecurity for 2FA-less users (T419422)]], [[gerrit:1250066{{!}}Send2FAWarningNotifications: Support reading users from file (T419111)]] (duration: 33m 07s)
* 08:05 moritzm: installing mariadb bugfix updates from Bookworm point release (tools and libraries as packaged in Debian, unrelated to the wmf-mariadb packages)
* 08:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1014.eqiad.wmnet with OS trixie
* 08:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
* 07:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
* 07:57 mszwarc@deploy2002: mszwarc: Continuing with sync
* 07:56 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1249921{{!}}Display list of 2FA-req. groups on AccountSecurity for 2FA-less users (T419422)]], [[gerrit:1250066{{!}}Send2FAWarningNotifications: Support reading users from file (T419111)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:44 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti1049.eqiad.wmnet
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
* 07:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
* 07:38 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4003.ulsfo.wmnet with OS bookworm
* 07:36 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1249921{{!}}Display list of 2FA-req. groups on AccountSecurity for 2FA-less users (T419422)]], [[gerrit:1250066{{!}}Send2FAWarningNotifications: Support reading users from file (T419111)]]
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
* 07:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir4003.ulsfo.wmnet on all recursors
* 07:34 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir4003.ulsfo.wmnet on all recursors
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
* 07:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
* 07:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir4003.ulsfo.wmnet
* 07:22 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247639{{!}}Enable rr-ml AutoModerator CC Set AutoModeratorMultiLingualRevertRisk with available wikis (T400727)]] (duration: 12m 24s)
* 07:18 kgraessle@deploy2002: kgraessle: Continuing with sync
* 07:12 kgraessle@deploy2002: kgraessle: Backport for [[gerrit:1247639{{!}}Enable rr-ml AutoModerator CC Set AutoModeratorMultiLingualRevertRisk with available wikis (T400727)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:09 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1247639{{!}}Enable rr-ml AutoModerator CC Set AutoModeratorMultiLingualRevertRisk with available wikis (T400727)]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 59s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 00:33 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250117{{!}}Stop setting $wgImageLinksSchemaMigrationStage (T299953)]] (duration: 09m 38s)
* 00:29 zabe@deploy2002: zabe: Continuing with sync
* 00:26 zabe@deploy2002: zabe: Backport for [[gerrit:1250117{{!}}Stop setting $wgImageLinksSchemaMigrationStage (T299953)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:24 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1250117{{!}}Stop setting $wgImageLinksSchemaMigrationStage (T299953)]]
* 00:03 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
* 00:03 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint1003.wikimedia.org with OS trixie
* 00:03 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:03 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
== 2026-03-10 ==
* 23:58 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
* 23:53 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
* 23:49 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
* 23:44 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint1003.wikimedia.org with reason: host reimage
* 23:40 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on contint1003.wikimedia.org with reason: host reimage
* 23:31 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2096.codfw.wmnet with OS bullseye
* 23:31 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2095.codfw.wmnet with OS bullseye
* 23:26 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2095.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:24 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2096.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:22 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 23:22 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:12 vriley@cumin1003: START - Cookbook sre.hosts.provision for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:11 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2096.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:05 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 23:05 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:59 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2095.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:39 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:38 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:51 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp7012.magru.wmnet with OS trixie
* 21:48 Dreamy_Jazz: Evening UTC backport window done
* 21:42 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7006.magru.wmnet [reason: trixie reimaging]
* 21:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7006.magru.wmnet with OS trixie
* 21:25 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1235552{{!}}Migrate EmailAuth, step 2 (T404334)]] (duration: 25m 34s)
* 21:24 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7007.magru.wmnet [reason: trixie reimaging]
* 21:22 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7007.magru.wmnet with OS trixie
* 21:21 tgr@deploy2002: tgr: Continuing with sync
* 21:13 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
* 21:09 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
* 21:02 tgr@deploy2002: tgr: Backport for [[gerrit:1235552{{!}}Migrate EmailAuth, step 2 (T404334)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:00 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1235552{{!}}Migrate EmailAuth, step 2 (T404334)]]
* 20:59 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7012.magru.wmnet with OS trixie
* 20:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7007.magru.wmnet with reason: host reimage
* 20:50 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249983{{!}}Deploy participant recruitment survey on ptwiki and trwiki (T419275)]], [[gerrit:1238733{{!}}wikifunctions: Drop temporary WikifunctionsEnableHTMLOutput flag (T397402)]], [[gerrit:1238734{{!}}wikifunctions: Drop temporary WikifunctionsEnableWikidataInputTypes flag (T397403)]], [[gerrit:1249393{{!}}build: Upgrade mediawiki-phan-config from 0.18.0 to 0.2
* 20:48 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7007.magru.wmnet with reason: host reimage
* 20:46 jforrester@deploy2002: dani, jforrester: Continuing with sync
* 20:45 jforrester@deploy2002: dani, jforrester: Backport for [[gerrit:1249983{{!}}Deploy participant recruitment survey on ptwiki and trwiki (T419275)]], [[gerrit:1238733{{!}}wikifunctions: Drop temporary WikifunctionsEnableHTMLOutput flag (T397402)]], [[gerrit:1238734{{!}}wikifunctions: Drop temporary WikifunctionsEnableWikidataInputTypes flag (T397403)]], [[gerrit:1249393{{!}}build: Upgrade mediawiki-phan-config from 0.18.0 to 0.20.0 (T41
* 20:43 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7006.magru.wmnet with OS trixie
* 20:43 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1249983{{!}}Deploy participant recruitment survey on ptwiki and trwiki (T419275)]], [[gerrit:1238733{{!}}wikifunctions: Drop temporary WikifunctionsEnableHTMLOutput flag (T397402)]], [[gerrit:1238734{{!}}wikifunctions: Drop temporary WikifunctionsEnableWikidataInputTypes flag (T397403)]], [[gerrit:1249393{{!}}build: Upgrade mediawiki-phan-config from 0.18.0 to 0.20
* 20:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7006.magru.wmnet with OS trixie
* 20:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - cdobbins@cumin2002"
* 20:38 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1240012{{!}}Enable personal main menu to all users in Minerva Neue skin (T413912)]], [[gerrit:1250007{{!}}Enables legacy processing in ParserOutputPostCacheTransform when cached (T372592)]], [[gerrit:1250015{{!}}Parser: Raise minimum TTL from 30 min to 'next midnight' in miser mode (T416616 T416540 T419439)]] (duration: 12m 58s)
* 20:36 cdobbins@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - cdobbins@cumin2002"
* 20:34 jforrester@deploy2002: jforrester, cscott, bwang: Continuing with sync
* 20:27 jforrester@deploy2002: jforrester, cscott, bwang: Backport for [[gerrit:1240012{{!}}Enable personal main menu to all users in Minerva Neue skin (T413912)]], [[gerrit:1250007{{!}}Enables legacy processing in ParserOutputPostCacheTransform when cached (T372592)]], [[gerrit:1250015{{!}}Parser: Raise minimum TTL from 30 min to 'next midnight' in miser mode (T416616 T416540 T419439)]] synced to the testservers (see https://wikitech.wi
* 20:25 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1240012{{!}}Enable personal main menu to all users in Minerva Neue skin (T413912)]], [[gerrit:1250007{{!}}Enables legacy processing in ParserOutputPostCacheTransform when cached (T372592)]], [[gerrit:1250015{{!}}Parser: Raise minimum TTL from 30 min to 'next midnight' in miser mode (T416616 T416540 T419439)]]
* 20:25 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7007.magru.wmnet with OS trixie
* 20:24 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7007.magru.wmnet [reason: trixie reimaging]
* 20:24 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7005.magru.wmnet [reason: trixie reimaging]
* 20:16 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7005.magru.wmnet with OS trixie
* 20:10 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
* 20:03 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7013.*
* 20:03 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
* 19:50 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7013.magru.wmnet with OS trixie
* 19:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7005.magru.wmnet with reason: host reimage
* 19:42 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7005.magru.wmnet with reason: host reimage
* 19:40 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7006.magru.wmnet with OS trixie
* 19:40 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7006.magru.wmnet [reason: trixie reimaging]
* 19:39 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7004.magru.wmnet [reason: trixie reimaging]
* 19:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7013.magru.wmnet with reason: host reimage
* 19:19 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7005.magru.wmnet with OS trixie
* 19:19 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7004.magru.wmnet with OS trixie
* 19:19 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7005.magru.wmnet [reason: trixie reimaging]
* 19:18 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7013.magru.wmnet with reason: host reimage
* 19:17 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7003.magru.wmnet [reason: trixie reimaging]
* 19:16 brennen@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.19 refs [[phab:T413810|T413810]]
* 19:16 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7003.magru.wmnet with OS trixie
* 19:09 brennen: 1.46.0-wmf.19 train status: blockers believed resolved, rolling to group0
* 19:07 brennen@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249937{{!}}Re-add correct namespace for translatable pages (T419294)]] (duration: 12m 30s)
* 19:06 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
* 19:01 brennen@deploy2002: abi, brennen: Continuing with sync
* 18:58 brennen@deploy2002: abi, brennen: Backport for [[gerrit:1249937{{!}}Re-add correct namespace for translatable pages (T419294)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:55 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7013.magru.wmnet with OS trixie
* 18:54 brennen@deploy2002: Started scap sync-world: Backport for [[gerrit:1249937{{!}}Re-add correct namespace for translatable pages (T419294)]]
* 18:52 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7004.magru.wmnet with reason: host reimage
* 18:52 brennen@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.19 refs [[phab:T413810|T413810]] (duration: 38m 34s)
* 18:49 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7004.magru.wmnet with reason: host reimage
* 18:47 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7003.magru.wmnet with reason: host reimage
* 18:44 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7003.magru.wmnet with reason: host reimage
* 18:41 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7015.*
* 18:27 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7015.magru.wmnet with OS trixie
* 18:23 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7004.magru.wmnet with OS trixie
* 18:21 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7004.magru.wmnet [reason: trixie reimaging]
* 18:16 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7003.magru.wmnet with OS trixie
* 18:13 brennen@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.19 refs [[phab:T413810|T413810]]
* 18:13 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7003.magru.wmnet [reason: trixie reimaging]
* 18:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7003.magru.wmnet [reason: trixie reimaging]
* 18:00 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 17:59 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 17:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7015.magru.wmnet with reason: host reimage
* 17:59 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 17:59 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 17:56 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7015.magru.wmnet with reason: host reimage
* 17:54 hashar@deploy2002: Finished deploy [integration/docroot@f544f49]: Catch up with composer/npm dev dependencies. Noop for production (duration: 00m 11s)
* 17:54 hashar@deploy2002: Started deploy [integration/docroot@f544f49]: Catch up with composer/npm dev dependencies. Noop for production
* 17:43 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 17:43 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 17:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:32 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:31 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:30 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7015.magru.wmnet with OS trixie
* 17:30 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:29 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:28 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:28 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 17:28 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:26 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:23 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
* 17:22 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 17:21 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 17:12 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:12 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:11 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:11 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:09 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:09 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:01 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 16:40 andrew@dns1004: END - running authdns-update
* 16:38 andrew@dns1004: START - running authdns-update
* 16:25 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249993{{!}}Revert "CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled"]] (duration: 07m 45s)
* 16:21 reedy@deploy2002: reedy: Continuing with sync
* 16:19 reedy@deploy2002: reedy: Backport for [[gerrit:1249993{{!}}Revert "CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:17 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1249993{{!}}Revert "CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled"]]
* 15:59 jynus@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 15:59 taavi: update cr firewall policy for codfw1dev ldap tree https://gerrit.wikimedia.org/r/1249985
* 15:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-fr-tech: apply
* 15:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-fr-tech: apply
* 15:55 jynus@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 15:48 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:34 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:28 brouberol@dns1004: END - running authdns-update
* 15:27 brouberol@dns1004: START - running authdns-update
* 15:10 swfrench@cumin2002: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002"
* 15:10 swfrench@cumin2002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002
* 15:09 swfrench@cumin2002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002
* 15:09 swfrench@cumin2002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002"
* 15:05 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:59 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:58 sukhe: sudo cumin -b1 -s15 "C:bird" "run-puppet-agent --enable 'merging CR 1238007; add function return type'"
* 14:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:58 sukhe: sudo cumin -b1 -s15 "C:bird" "run-puppet-agent 'merging CR 1238007; add function return type'"
* 14:51 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:42 sukhe: sudo cumin "C:bird" "disable-puppet 'merging CR 1238007; add function return type'"
* 14:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host ml-serve1014
* 14:39 elukey@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:36 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host ml-serve1014
* 14:36 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.powercycle (exit_code=99) for host ml-serve1014
* 14:36 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host ml-serve1014
* 14:12 otto@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249367{{!}}stream: mediawiki.page_edit_type_simple.dev0 (T351225)]] (duration: 11m 05s)
* 14:08 otto@deploy2002: akhatun, otto: Continuing with sync
* 14:02 otto@deploy2002: akhatun, otto: Backport for [[gerrit:1249367{{!}}stream: mediawiki.page_edit_type_simple.dev0 (T351225)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:01 otto@deploy2002: Started scap sync-world: Backport for [[gerrit:1249367{{!}}stream: mediawiki.page_edit_type_simple.dev0 (T351225)]]
* 13:49 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
* 13:43 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:28 vgutierrez: testing acme-chief 0.39 in acmechief-test2001 - [[phab:T419352|T419352]]
* 13:27 vgutierrez: upload acme-chief 0.39 to bookworm-wikimedia (apt.wm.o) - [[phab:T419352|T419352]]
* 13:16 jiji@cumin1003: END (FAIL) - Cookbook sre.memcached.roll-reboot-restart (exit_code=1) rolling restart_daemons on A:memcached-canary
* 13:16 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling restart_daemons on A:memcached-canary
* 13:12 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249903{{!}}Require 2FA from 6 other user groups ($wgRestrictedGroups) (T418580)]], [[gerrit:1249035{{!}}kaiwiki: add logo, stiename, projectnamespace and timezone (T414237)]] (duration: 08m 45s)
* 13:08 mszwarc@deploy2002: mszwarc, anzx: Continuing with sync
* 13:05 mszwarc@deploy2002: mszwarc, anzx: Backport for [[gerrit:1249903{{!}}Require 2FA from 6 other user groups ($wgRestrictedGroups) (T418580)]], [[gerrit:1249035{{!}}kaiwiki: add logo, stiename, projectnamespace and timezone (T414237)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1249903{{!}}Require 2FA from 6 other user groups ($wgRestrictedGroups) (T418580)]], [[gerrit:1249035{{!}}kaiwiki: add logo, stiename, projectnamespace and timezone (T414237)]]
* 13:03 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad
* 12:57 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1015.eqiad.wmnet with OS bookworm
* 12:56 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 12:51 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1014.eqiad.wmnet with OS bookworm
* 12:50 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ml-serve1014
* 12:50 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ml-serve1014
* 12:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:49 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:49 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:48 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:47 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:45 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:44 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:42 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:42 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling restart_daemons on A:memcached-canary
* 12:42 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling restart_daemons on A:memcached-canary
* 12:31 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 12:10 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 12:10 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 11:59 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 11:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe2024.codfw.wmnet with OS bullseye
* 11:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1003"
* 11:17 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1003"
* 11:15 Emperor: rebalance codfw swift rings [[phab:T354872|T354872]]
* 10:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-fe2024.codfw.wmnet with reason: host reimage
* 10:47 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-fe2024.codfw.wmnet with reason: host reimage
* 10:31 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host ms-fe2024.codfw.wmnet with OS bullseye
* 10:30 ayounsi@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-fe2024.codfw.wmnet with OS bullseye
* 10:20 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host ms-fe2024.codfw.wmnet with OS bullseye
* 10:17 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device cr2-eqdfw
* 09:31 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device cr2-eqdfw
* 09:22 derick@deploy2002: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=loginwiki --logwiki=metawiki TMPRI1975 FondueFanatic # [[phab:T419499|T419499]]
* 09:00 arnaudb@dns1005: END - running authdns-update
* 09:00 godog: restore all host interfaces - [[phab:T417393|T417393]]
* 08:58 arnaudb@dns1005: START - running authdns-update
* 08:30 godog: disabled interface for cloudcephmon1004 - [[phab:T417393|T417393]]
* 08:22 godog: disabled interfaces for cloudcephosd1021 cloudcephosd1042 cloudcephosd1043 cloudcephosd1018 cloudcephosd1022 - [[phab:T417393|T417393]]
* 08:18 godog: disabled interfaces for cloudcephosd1016 cloudcephosd1017 cloudcephosd1016 cloudcephosd1018 cloudcephosd1017 cloudcephosd1035 - [[phab:T417393|T417393]]
* 08:05 godog: start disabling cloudcephosd interfaces - [[phab:T417393|T417393]]
* 07:49 godog: prep cloudsw reboot tests 'ceph osd set noout' - [[phab:T417393|T417393]]
* 07:41 filippo@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 19 hosts with reason: switch down tests
* 06:14 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs2009.codfw.wmnet with OS bookworm
* 04:09 pt1979@cumin2002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device asw1-23-ulsfo
* 04:08 pt1979@cumin2002: START - Cookbook sre.network.tls for network device asw1-23-ulsfo
* 04:01 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.16 (duration: 01m 48s)
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 10s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:37 ryankemper: [WDQS] [[phab:T410573|T410573]] repooled wdqs1011.eqiad.wmnet - erroneously depooled since `2025-11-19` by failed `sre.wdqs.reboot` cookbook
* 00:42 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 00:39 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
== 2026-03-09 ==
* 22:51 rzl: root@apt1002:~# reprepro --noskipold --restrict vopsbot update bookworm-wikimedia
* 22:34 bking@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM dse-k8s-ctrl1001.eqiad.wmnet
* 22:32 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1001.eqiad.wmnet
* 22:30 bking@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:29 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:28 bking@cumin2002: END (FAIL) - Cookbook sre.ganeti.reboot-vm (exit_code=99) for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:28 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:28 bking@cumin2002: END (FAIL) - Cookbook sre.ganeti.reboot-vm (exit_code=99) for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:28 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:03 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw2004-dev.codfw.wmnet with OS trixie
* 22:02 alexsanford: Redeployed security fix for [[phab:T419186|T419186]]
* 21:44 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
* 21:40 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
* 21:37 cdobbins@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7002.magru.wmnet
* 21:34 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7002.magru.wmnet with OS trixie
* 21:29 alexsanford: Deployed security fix for [[phab:T419186|T419186]]
* 21:22 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudgw2004-dev.codfw.wmnet with OS trixie
* 21:21 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudgw2004-dev.codfw.wmnet with OS trixie
* 21:17 dani@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249370{{!}}Pre-deploy participant recruitment survey on ptwiki and trwiki (T419275)]] (duration: 08m 15s)
* 21:13 dani@deploy2002: dani: Continuing with sync
* 21:11 dani@deploy2002: dani: Backport for [[gerrit:1249370{{!}}Pre-deploy participant recruitment survey on ptwiki and trwiki (T419275)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:09 dani@deploy2002: Started scap sync-world: Backport for [[gerrit:1249370{{!}}Pre-deploy participant recruitment survey on ptwiki and trwiki (T419275)]]
* 21:08 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
* 21:05 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cp7002.magru.wmnet with reason: host reimage
* 21:02 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7002.magru.wmnet with reason: host reimage
* 21:01 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
* 21:01 tgr_: removed private code for [[phab:T397244|T397244]]
* 21:01 ryankemper: [WDQS] Alright, these are re-entering a failed state soon enough that we will need to identify the offender if we want to restore proper service. We could put in a temporary hack to restart every few minutes so we at least maintain some uptime, but the root cause is the usual 'we need a requestctl rule to block whoever's killing us' scenario
* 21:00 cdobbins@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7001.magru.wmnet [reason: Trixie reimaging]
* 20:57 ryankemper: [WDQS] Auto-remediation would have eventually restarted these, but some of them were staying below our current threshold of `threads > 1200`. May want to lower the threshold, or examine an additional metric type in the future
* 20:56 ryankemper: [WDQS] `ryankemper@cumin2002:~$ sudo -E cumin 'A:wdqs-main AND P<nowiki>{</nowiki>wdqs1*<nowiki>}</nowiki>' 'systemctl restart wdqs-blazegraph'`
* 20:54 ryankemper: [WDQS] `ryankemper@cumin2002:~$ sudo -E cumin 'A:wdqs-main AND P<nowiki>{</nowiki>wdqs2*<nowiki>}</nowiki>' 'systemctl restart wdqs-blazegraph'`
* 20:44 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudgw2004-dev.codfw.wmnet with OS trixie
* 20:43 tgr@deploy2002: Unlocked for deployment [MediaWiki]: working on private change (duration: 10m 10s)
* 20:36 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7002.magru.wmnet with OS trixie
* 20:33 tgr@deploy2002: Locking from deployment [MediaWiki]: working on private change
* 20:31 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247119{{!}}Enable parser survey for opted-out users on German/French/Polish wikis (T414852)]], [[gerrit:1249316{{!}}lift IP cap for womens month editathon (T419109)]] (duration: 13m 36s)
* 20:27 tgr@deploy2002: cscott, tgr, anzx: Continuing with sync
* 20:19 tgr@deploy2002: cscott, tgr, anzx: Backport for [[gerrit:1247119{{!}}Enable parser survey for opted-out users on German/French/Polish wikis (T414852)]], [[gerrit:1249316{{!}}lift IP cap for womens month editathon (T419109)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1247119{{!}}Enable parser survey for opted-out users on German/French/Polish wikis (T414852)]], [[gerrit:1249316{{!}}lift IP cap for womens month editathon (T419109)]]
* 20:13 aaron@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249363{{!}}Remove redundant math spec file from wwwportal (T418188)]] (duration: 06m 56s)
* 20:09 aaron@deploy2002: aaron: Continuing with sync
* 20:08 aaron@deploy2002: aaron: Backport for [[gerrit:1249363{{!}}Remove redundant math spec file from wwwportal (T418188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:06 aaron@deploy2002: Started scap sync-world: Backport for [[gerrit:1249363{{!}}Remove redundant math spec file from wwwportal (T418188)]]
* 20:01 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7016.*
* 19:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7001.magru.wmnet with OS trixie
* 19:51 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7016.magru.wmnet with OS trixie
* 19:49 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248911{{!}}Stop writing to il_to on commonswiki (T415787)]] (duration: 06m 04s)
* 19:45 zabe@deploy2002: zabe: Continuing with sync
* 19:44 zabe@deploy2002: zabe: Backport for [[gerrit:1248911{{!}}Stop writing to il_to on commonswiki (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:43 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248911{{!}}Stop writing to il_to on commonswiki (T415787)]]
* 19:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 19:28 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 19:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7001.magru.wmnet with reason: host reimage
* 19:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7016.magru.wmnet with reason: host reimage
* 19:23 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7001.magru.wmnet with reason: host reimage
* 19:19 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7016.magru.wmnet with reason: host reimage
* 19:15 cwhite@deploy2002: Finished deploy [performance/arc-lamp@aa8da8b]: {{Gerrit|Ie7e0355f89294a2927f9dbc28afec3a62d1752de}} (duration: 00m 08s)
* 19:15 cwhite@deploy2002: Started deploy [performance/arc-lamp@aa8da8b]: {{Gerrit|Ie7e0355f89294a2927f9dbc28afec3a62d1752de}}
* 19:14 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 19:14 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 19:05 herron@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249365{{!}}udp2log: switch to new hosts]] (duration: 09m 38s)
* 19:03 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:03 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:01 herron@deploy2002: herron: Continuing with sync
* 19:00 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:00 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 18:59 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 18:59 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 18:59 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 18:59 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 18:57 herron@deploy2002: herron: Backport for [[gerrit:1249365{{!}}udp2log: switch to new hosts]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:57 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7001.magru.wmnet with OS trixie
* 18:55 herron@deploy2002: Started scap sync-world: Backport for [[gerrit:1249365{{!}}udp2log: switch to new hosts]]
* 18:55 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7016.magru.wmnet with OS trixie
* 18:50 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 18:49 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 18:44 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 18:44 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 18:34 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 18:34 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 18:33 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 18:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 18:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 18:32 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 18:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 18:29 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 18:27 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 18:27 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 18:23 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 18:23 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
* 18:23 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 18:16 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 18:16 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 18:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1001.eqiad.wmnet
* 18:05 herron@deploy2002: Sync cancelled.
* 18:04 herron@deploy2002: herron: Backport for [[gerrit:1249361{{!}}Revert "udp2log: switch to new hosts"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:02 herron@deploy2002: Started scap sync-world: Backport for [[gerrit:1249361{{!}}Revert "udp2log: switch to new hosts"]]
* 18:01 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1001.eqiad.wmnet
* 17:54 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:47 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:42 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:42 herron@deploy2002: Sync cancelled.
* 17:40 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:39 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:38 mutante: contint1003 - unable to get uptime Caused by: Cumin execution failed (exit_code=2) [101/240] - attempted manual powercycle - Initializing Firmware Interfaces... blank screen [[phab:T418544|T418544]]
* 17:34 mutante: contint1003.mgmt - racadm serveraction powercycle [[phab:T418544|T418544]] - not reacting
* 17:25 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:25 herron@deploy2002: herron: Backport for [[gerrit:1249332{{!}}udp2log: switch to new hosts (T417002)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:23 herron@deploy2002: Started scap sync-world: Backport for [[gerrit:1249332{{!}}udp2log: switch to new hosts (T417002)]]
* 17:19 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host netflow4003.ulsfo.wmnet
* 17:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host netflow4003.ulsfo.wmnet with OS bookworm
* 17:13 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:08 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet
* 17:03 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 17:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet
* 17:00 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis kaiwiki in section s5
* 16:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on netflow4003.ulsfo.wmnet with reason: host reimage
* 16:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on netflow4003.ulsfo.wmnet with reason: host reimage
* 16:37 moritzm: installing gnupg security updates
* 16:31 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host netflow4003.ulsfo.wmnet with OS bookworm
* 16:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
* 16:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
* 16:30 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netflow4003.ulsfo.wmnet on all recursors
* 16:30 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netflow4003.ulsfo.wmnet on all recursors
* 16:30 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
* 16:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
* 16:26 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:26 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host netflow4003.ulsfo.wmnet
* 16:26 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 15:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on prometheus4003.ulsfo.wmnet with reason: host reimage
* 15:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on prometheus4003.ulsfo.wmnet with reason: host reimage
* 15:44 vgutierrez: vgutierrez@acmechief-test2001:~$ sudo -i systemctl disable reload-acme-chief-backend.timer - [[phab:T419352|T419352]]
* 15:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 15:37 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 15:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 15:30 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 15:26 elukey@cumin1003: END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 15:24 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 15:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2004.codfw.wmnet
* 15:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people2004.codfw.wmnet
* 15:12 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 15:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo1001.eqiad.wmnet
* 15:08 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 15:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo1001.eqiad.wmnet
* 14:50 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2009.codfw.wmnet with OS bookworm
* 14:49 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs2009.codfw.wmnet with OS bullseye
* 14:45 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 14:35 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249291{{!}}Hide 2fa-warning Echo category from preferences (T419111)]] (duration: 06m 07s)
* 14:35 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis kaiwiki in section s5
* 14:34 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.sanitize-wiki (exit_code=99) Managing sanitization for wikis urwikisource in section s5
* 14:31 mszwarc@deploy2002: mszwarc: Continuing with sync
* 14:31 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1249291{{!}}Hide 2fa-warning Echo category from preferences (T419111)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:30 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis urwikisource in section s5
* 14:29 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1249291{{!}}Hide 2fa-warning Echo category from preferences (T419111)]]
* 14:25 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Checking sanitization for wikis urwikisource in section s5
* 14:22 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Checking sanitization for wikis urwikisource in section s5
* 14:20 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 14:15 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 14:15 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249243{{!}}JS SDK: Add getExperimentByPrefix() (T419191)]], [[gerrit:1249242{{!}}ext.wikimediaEvents: pageVisit -> loggedOutReaderRetention (T419191)]] (duration: 09m 39s)
* 14:11 phuedx@deploy2002: phuedx: Continuing with sync
* 14:07 phuedx@deploy2002: phuedx: Backport for [[gerrit:1249243{{!}}JS SDK: Add getExperimentByPrefix() (T419191)]], [[gerrit:1249242{{!}}ext.wikimediaEvents: pageVisit -> loggedOutReaderRetention (T419191)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:05 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1249243{{!}}JS SDK: Add getExperimentByPrefix() (T419191)]], [[gerrit:1249242{{!}}ext.wikimediaEvents: pageVisit -> loggedOutReaderRetention (T419191)]]
* 14:03 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet
* 13:54 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2009.codfw.wmnet with OS bullseye
* 13:50 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249262{{!}}Disable MetricsPlatform extension (T416865)]] (duration: 08m 02s)
* 13:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet
* 13:46 phuedx@deploy2002: phuedx, sfaci: Continuing with sync
* 13:44 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 13:44 phuedx@deploy2002: phuedx, sfaci: Backport for [[gerrit:1249262{{!}}Disable MetricsPlatform extension (T416865)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:42 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1249262{{!}}Disable MetricsPlatform extension (T416865)]]
* 13:39 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248075{{!}}Confirmemail: Log delay between email sent and confirmation (T415902)]], [[gerrit:1247651{{!}}Enable confirmemail logstash channel (T415902)]] (duration: 11m 16s)
* 13:35 phuedx@deploy2002: mmartorana, phuedx: Continuing with sync
* 13:30 phuedx@deploy2002: mmartorana, phuedx: Backport for [[gerrit:1248075{{!}}Confirmemail: Log delay between email sent and confirmation (T415902)]], [[gerrit:1247651{{!}}Enable confirmemail logstash channel (T415902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1248075{{!}}Confirmemail: Log delay between email sent and confirmation (T415902)]], [[gerrit:1247651{{!}}Enable confirmemail logstash channel (T415902)]]
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 13:04 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 12:55 moritzm: installing Kerberos security updates
* 12:29 moritzm: installing python3.9 security updates
* 12:11 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 12:00 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1239026{{!}}Revert "CommonSettings: Temporarily set $wgOATHUserHandlesTable = true" (T416544)]], [[gerrit:1249253{{!}}CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled]] (duration: 06m 13s)
* 11:56 reedy@deploy2002: reedy: Continuing with sync
* 11:56 reedy@deploy2002: reedy: Backport for [[gerrit:1239026{{!}}Revert "CommonSettings: Temporarily set $wgOATHUserHandlesTable = true" (T416544)]], [[gerrit:1249253{{!}}CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:54 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1239026{{!}}Revert "CommonSettings: Temporarily set $wgOATHUserHandlesTable = true" (T416544)]], [[gerrit:1249253{{!}}CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled]]
* 11:44 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249245{{!}}Hooks: Really only add global logging context for pageviews]] (duration: 12m 02s)
* 11:38 phuedx@deploy2002: phuedx: Continuing with sync
* 11:34 phuedx@deploy2002: phuedx: Backport for [[gerrit:1249245{{!}}Hooks: Really only add global logging context for pageviews]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:32 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1249245{{!}}Hooks: Really only add global logging context for pageviews]]
* 11:29 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 11:29 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 11:03 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 11:03 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:57 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 10:56 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 10:50 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:49 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:45 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus4003.ulsfo.wmnet
* 10:45 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 10:45 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
* 10:44 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus4003.ulsfo.wmnet on all recursors
* 10:43 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache prometheus4003.ulsfo.wmnet on all recursors
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
* 10:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
* 10:40 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:39 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:39 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:39 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host prometheus4003.ulsfo.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:33 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:17 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 10:12 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:51 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 09:46 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 09:43 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:40 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus4003.ulsfo.wmnet
* 09:40 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 09:35 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:35 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host prometheus4003.ulsfo.wmnet
* 09:31 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host frdb1008
* 09:31 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host frdb1008
* 09:29 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 09:05 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 08:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
* 08:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 08:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 08:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 08:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 08:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 08:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 08:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
* 08:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
* 08:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 08:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 08:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
* 08:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:25 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
* 08:24 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 08:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 08:21 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 08:16 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
* 08:16 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
* 08:07 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo and group 1
* 08:07 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo and group 1
* 07:37 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248806{{!}}Add a script to send mandatory 2FA Echo notification (T419111)]], [[gerrit:1248821{{!}}Set $wgOATH2FARequiredGroupRemovalPages for interface-admins (T417880)]] (duration: 34m 41s)
* 07:23 mszwarc@deploy2002: mszwarc: Continuing with sync
* 07:22 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1248806{{!}}Add a script to send mandatory 2FA Echo notification (T419111)]], [[gerrit:1248821{{!}}Set $wgOATH2FARequiredGroupRemovalPages for interface-admins (T417880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:02 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1248806{{!}}Add a script to send mandatory 2FA Echo notification (T419111)]], [[gerrit:1248821{{!}}Set $wgOATH2FARequiredGroupRemovalPages for interface-admins (T417880)]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 58s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-08 ==
* 20:28 vgutierrez@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on acmechief-test2001.codfw.wmnet with reason: GTS issues
* 02:01 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 00m 59s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-07 ==
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 23s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:20 krinkle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248952{{!}}CSP: restore toolforge/wmcs entry in false positive list]] (duration: 10m 46s)
* 01:16 krinkle@deploy2002: krinkle: Continuing with sync
* 01:11 krinkle@deploy2002: krinkle: Backport for [[gerrit:1248952{{!}}CSP: restore toolforge/wmcs entry in false positive list]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:09 krinkle@deploy2002: Started scap sync-world: Backport for [[gerrit:1248952{{!}}CSP: restore toolforge/wmcs entry in false positive list]]
* 00:22 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2043.*
* 00:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp2043.codfw.wmnet
* 00:05 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp2043.codfw.wmnet
== 2026-03-06 ==
* 23:29 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs2009.codfw.wmnet with OS bullseye
* 23:13 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 23:07 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 22:46 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs2009
* 22:46 ryankemper@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2009
* 22:46 ryankemper@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2009
* 22:46 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs2009.codfw.wmnet 141.0.192.10.in-addr.arpa 1.4.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:46 ryankemper@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs2009.codfw.wmnet 141.0.192.10.in-addr.arpa 1.4.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:45 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:45 ryankemper@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs2009 - ryankemper@cumin2002"
* 22:45 ryankemper@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs2009 - ryankemper@cumin2002"
* 22:41 ryankemper@cumin2002: START - Cookbook sre.dns.netbox
* 22:40 ryankemper@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs2009
* 22:39 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2009.codfw.wmnet with OS bullseye
* 19:48 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:47 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:47 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:46 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:36 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host wdqs2009.codfw.wmnet
* 19:23 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2009.codfw.wmnet
* 19:17 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on wdqs2009.codfw.wmnet with reason: NFS might be hung, about to reboot
* 18:56 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp2043.codfw.wmnet with reason: troubleshooting for network drops
* 18:44 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp2043.*
* 18:29 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts an-backup-datanode1033.eqiad.wmnet
* 18:29 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:29 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-backup-datanode1033.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1003"
* 18:28 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-backup-datanode1033.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1003"
* 17:59 ebernhardson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248858{{!}}cirrus: Use https for semanticsearch-test cluster]] (duration: 11m 20s)
* 17:53 ebernhardson@deploy2002: ebernhardson: Continuing with sync
* 17:52 ebernhardson@deploy2002: ebernhardson: Backport for [[gerrit:1248858{{!}}cirrus: Use https for semanticsearch-test cluster]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:51 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:51 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:47 ebernhardson@deploy2002: Started scap sync-world: Backport for [[gerrit:1248858{{!}}cirrus: Use https for semanticsearch-test cluster]]
* 17:42 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:42 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:40 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:40 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:11 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:11 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:10 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 17:05 hashar@deploy2002: Finished deploy [gerrit/gerrit@b8183ba]: wm-checks-api: add tooltip to the CheckRun Run action (duration: 00m 13s)
* 17:05 hashar@deploy2002: Started deploy [gerrit/gerrit@b8183ba]: wm-checks-api: add tooltip to the CheckRun Run action
* 17:04 btullis@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-backup-datanode1033.eqiad.wmnet
* 16:48 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 16:48 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 16:23 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 16:23 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 15:57 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:57 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:56 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:56 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:52 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2354-2356].codfw.wmnet
* 15:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2354-2356].codfw.wmnet
* 15:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2356.codfw.wmnet with OS trixie
* 15:46 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2355.codfw.wmnet with OS trixie
* 15:42 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2354.codfw.wmnet with OS trixie
* 15:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2356.codfw.wmnet with reason: host reimage
* 15:31 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
* 15:30 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
* 15:28 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 15:28 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
* 15:28 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
* 15:26 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
* 15:26 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
* 15:26 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2355.codfw.wmnet with reason: host reimage
* 15:24 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 15:23 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 15:23 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2354.codfw.wmnet with reason: host reimage
* 15:19 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 15:19 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 15:17 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 15:17 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 15:17 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2356.codfw.wmnet with reason: host reimage
* 15:16 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2355.codfw.wmnet with reason: host reimage
* 15:16 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2354.codfw.wmnet with reason: host reimage
* 15:15 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 15:10 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 15:09 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 15:08 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 15:08 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 15:06 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 15:05 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 15:05 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 15:05 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 15:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2356.codfw.wmnet with OS trixie
* 15:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2355.codfw.wmnet with OS trixie
* 15:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2354.codfw.wmnet with OS trixie
* 15:02 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
* 15:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2348-2353].codfw.wmnet
* 15:02 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
* 15:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2348-2353].codfw.wmnet
* 14:59 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2353.codfw.wmnet with OS trixie
* 14:57 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2349.codfw.wmnet with OS trixie
* 14:57 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:56 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
* 14:53 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:52 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
* 14:52 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2351.codfw.wmnet with OS trixie
* 14:49 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 14:49 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 14:48 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2352.codfw.wmnet with OS trixie
* 14:48 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 14:48 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 14:48 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:47 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 14:45 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2350.codfw.wmnet with OS trixie
* 14:44 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 14:43 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:43 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 14:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2348.codfw.wmnet with OS trixie
* 14:41 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2353.codfw.wmnet with reason: host reimage
* 14:37 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2349.codfw.wmnet with reason: host reimage
* 14:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2351.codfw.wmnet with reason: host reimage
* 14:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2352.codfw.wmnet with reason: host reimage
* 14:29 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:28 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:26 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2350.codfw.wmnet with reason: host reimage
* 14:23 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2348.codfw.wmnet with reason: host reimage
* 14:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2351.codfw.wmnet with reason: host reimage
* 14:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2352.codfw.wmnet with reason: host reimage
* 14:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2353.codfw.wmnet with reason: host reimage
* 14:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2350.codfw.wmnet with reason: host reimage
* 14:19 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2349.codfw.wmnet with reason: host reimage
* 14:19 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2348.codfw.wmnet with reason: host reimage
* 14:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2353.codfw.wmnet with OS trixie
* 14:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2352.codfw.wmnet with OS trixie
* 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2351.codfw.wmnet with OS trixie
* 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2350.codfw.wmnet with OS trixie
* 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2349.codfw.wmnet with OS trixie
* 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2348.codfw.wmnet with OS trixie
* 14:03 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2349.codfw.wmnet with OS trixie
* 14:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2349.codfw.wmnet with OS trixie
* 14:03 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2348.codfw.wmnet with OS trixie
* 14:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2348.codfw.wmnet with OS trixie
* 14:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2342-2347].codfw.wmnet
* 14:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2342-2347].codfw.wmnet
* 14:01 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2347.codfw.wmnet with OS trixie
* 13:57 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2346.codfw.wmnet with OS trixie
* 13:55 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2343.codfw.wmnet with OS trixie
* 13:50 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2345.codfw.wmnet with OS trixie
* 13:48 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2344.codfw.wmnet with OS trixie
* 13:45 dreamyjazz@deploy2002: mwscript-k8s job started: foreachwikiindblist checkuser-suggested-investigations CheckUser:queueAutoCloseSICases.php # [[phab:T418591|T418591]]
* 13:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2342.codfw.wmnet with OS trixie
* 13:42 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2347.codfw.wmnet with reason: host reimage
* 13:38 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2346.codfw.wmnet with reason: host reimage
* 13:35 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2343.codfw.wmnet with reason: host reimage
* 13:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2345.codfw.wmnet with reason: host reimage
* 13:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2344.codfw.wmnet with reason: host reimage
* 13:24 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2342.codfw.wmnet with reason: host reimage
* 13:21 Dreamy_Jazz: Running foreachwikiindblist checkuser-suggested-investigations.dblist ~/PopulateSiuInfo.php --batch-size=1000 for [[phab:T411118|T411118]]
* 13:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2347.codfw.wmnet with reason: host reimage
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2346.codfw.wmnet with reason: host reimage
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2345.codfw.wmnet with reason: host reimage
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2344.codfw.wmnet with reason: host reimage
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2343.codfw.wmnet with reason: host reimage
* 13:19 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2342.codfw.wmnet with reason: host reimage
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2347.codfw.wmnet with OS trixie
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2346.codfw.wmnet with OS trixie
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2345.codfw.wmnet with OS trixie
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2344.codfw.wmnet with OS trixie
* 13:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2343.codfw.wmnet with OS trixie
* 13:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2342.codfw.wmnet with OS trixie
* 13:05 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2336-2341].codfw.wmnet
* 13:05 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2336-2341].codfw.wmnet
* 13:01 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2341.codfw.wmnet with OS trixie
* 12:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2340.codfw.wmnet with OS trixie
* 12:49 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2337.codfw.wmnet with OS trixie
* 12:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2338.codfw.wmnet with OS trixie
* 12:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2336.codfw.wmnet with OS trixie
* 12:40 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2341.codfw.wmnet with reason: host reimage
* 12:35 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2339.codfw.wmnet with OS trixie
* 12:34 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2340.codfw.wmnet with reason: host reimage
* 12:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2337.codfw.wmnet with reason: host reimage
* 12:26 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2338.codfw.wmnet with reason: host reimage
* 12:22 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2336.codfw.wmnet with reason: host reimage
* 12:18 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2339.codfw.wmnet with reason: host reimage
* 12:13 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2340.codfw.wmnet with reason: host reimage
* 12:13 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2341.codfw.wmnet with reason: host reimage
* 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2337.codfw.wmnet with reason: host reimage
* 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2338.codfw.wmnet with reason: host reimage
* 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2336.codfw.wmnet with reason: host reimage
* 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2339.codfw.wmnet with reason: host reimage
* 12:00 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2341.codfw.wmnet with OS trixie
* 12:00 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2340.codfw.wmnet with OS trixie
* 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2339.codfw.wmnet with OS trixie
* 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2338.codfw.wmnet with OS trixie
* 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2337.codfw.wmnet with OS trixie
* 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2336.codfw.wmnet with OS trixie
* 11:56 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2333-2335].codfw.wmnet
* 11:55 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2333-2335].codfw.wmnet
* 11:55 elukey@cumin1003: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 11:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1207.eqiad.wmnet
* 11:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2335.codfw.wmnet with OS trixie
* 11:53 moritzm: uploaded icu 72.1-3+deb12u1~wmf11u1 to component/php83-icu72 [[phab:T419058|T419058]] (backport of ICU 72 from Bookworm to Bullseye, built to be co-installable with the native ICU from Bullseye)
* 11:50 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2334.codfw.wmnet with OS trixie
* 11:49 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1207.eqiad.wmnet
* 11:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1205.eqiad.wmnet
* 11:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2333.codfw.wmnet with OS trixie
* 11:39 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 11:39 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1205.eqiad.wmnet
* 11:34 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2335.codfw.wmnet with reason: host reimage
* 11:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2334.codfw.wmnet with reason: host reimage
* 11:27 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2333.codfw.wmnet with reason: host reimage
* 11:23 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2335.codfw.wmnet with reason: host reimage
* 11:22 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2334.codfw.wmnet with reason: host reimage
* 11:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2333.codfw.wmnet with reason: host reimage
* 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2334.codfw.wmnet with OS trixie
* 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2335.codfw.wmnet with OS trixie
* 11:08 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2333.codfw.wmnet with OS trixie
* 11:06 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2332.codfw.wmnet
* 11:05 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2332.codfw.wmnet
* 11:02 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2332.codfw.wmnet with OS trixie
* 10:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2332.codfw.wmnet with reason: host reimage
* 10:36 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2332.codfw.wmnet with reason: host reimage
* 10:23 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2332.codfw.wmnet with OS trixie
* 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1199.eqiad.wmnet
* 10:21 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1194.eqiad.wmnet
* 10:16 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2332-2356].codfw.wmnet
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1194.eqiad.wmnet
* 10:09 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1199.eqiad.wmnet
* 10:09 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2332-2356].codfw.wmnet
* 09:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:39 Emperor: repool ms-fe1013 after PXE work [[phab:T401966|T401966]]
* 09:23 derick@deploy2002: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=pmswiki --logwiki=metawiki Wikilimes Limes.pink # [[phab:T419184|T419184]]
* 09:10 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:09 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:08 elukey@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-fe1013.eqiad.wmnet
* 09:08 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-fe1013.eqiad.wmnet
* 08:57 elukey@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-fe1013.eqiad.wmnet
* 08:56 elukey@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-fe1013.eqiad.wmnet
* 08:54 elukey@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-fe1013.eqiad.wmnet
* 08:42 elukey@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-fe1013.eqiad.wmnet
* 08:25 moritzm: uploaded openjdk-8 8u482-ga-1~deb12u1 to component/jdk8 of bookworm-wikimedia
* 08:11 moritzm: imported prometheus-ganeti-exporter 0.3+deb12u2 for bookworm-wikimedia [[phab:T419166|T419166]]
* 06:23 ryankemper@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
* 06:23 ryankemper@deploy2002: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
* 06:23 ryankemper@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
* 06:23 ryankemper@deploy2002: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
* 06:22 ryankemper@deploy2002: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
* 06:22 ryankemper@deploy2002: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
* 02:59 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:59 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
* 02:59 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
* 02:56 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 02:21 zabe: zabe@deploy2002:/srv/mediawiki-staging$ foreachwiki extensions/TimedMediaHandler/maintenance/migrateTranscodeStates.php --force # [[phab:T415064|T415064]]
* 02:16 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248658{{!}}Update interwiki cache]] (duration: 06m 38s)
* 02:12 zabe@deploy2002: mwscript-k8s job started: foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https # [[phab:T415978|T415978]], [[phab:T414241|T414241]]
* 02:12 zabe@deploy2002: zabe: Continuing with sync
* 02:11 zabe@deploy2002: zabe: Backport for [[gerrit:1248658{{!}}Update interwiki cache]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 02:09 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248658{{!}}Update interwiki cache]]
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 23s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:59 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248656{{!}}Set urwikisource to rtl (T415960)]] (duration: 06m 39s)
* 01:55 zabe@deploy2002: zabe: Continuing with sync
* 01:54 zabe@deploy2002: zabe: Backport for [[gerrit:1248656{{!}}Set urwikisource to rtl (T415960)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:53 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248656{{!}}Set urwikisource to rtl (T415960)]]
* 01:45 zabe@deploy2002: Sync cancelled.
* 01:43 zabe@deploy2002: zabe: Backport for [[gerrit:1248653{{!}}Activate urwikisource (T415960)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:42 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248653{{!}}Activate urwikisource (T415960)]]
* 01:38 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248652{{!}}Prepare urwikisource (T415960)]] (duration: 06m 18s)
* 01:34 zabe@deploy2002: zabe: Continuing with sync
* 01:34 zabe@deploy2002: zabe: Backport for [[gerrit:1248652{{!}}Prepare urwikisource (T415960)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:32 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248652{{!}}Prepare urwikisource (T415960)]]
* 01:29 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248650{{!}}Activate kaiwiki (T414234)]] (duration: 06m 57s)
* 01:25 zabe@deploy2002: zabe: Continuing with sync
* 01:24 zabe@deploy2002: zabe: Backport for [[gerrit:1248650{{!}}Activate kaiwiki (T414234)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:22 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248650{{!}}Activate kaiwiki (T414234)]]
* 01:17 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248647{{!}}Prepare kaiwiki (T414234)]] (duration: 07m 25s)
* 01:13 zabe@deploy2002: zabe: Continuing with sync
* 01:11 zabe@deploy2002: zabe: Backport for [[gerrit:1248647{{!}}Prepare kaiwiki (T414234)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:09 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248647{{!}}Prepare kaiwiki (T414234)]]
* 00:33 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248493{{!}}Stop writing to il_to on all wikis except commons (T415787)]] (duration: 06m 22s)
* 00:29 zabe@deploy2002: zabe: Continuing with sync
* 00:28 zabe@deploy2002: zabe: Backport for [[gerrit:1248493{{!}}Stop writing to il_to on all wikis except commons (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:27 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248493{{!}}Stop writing to il_to on all wikis except commons (T415787)]]
* 00:05 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248628{{!}}Re-enable AllowUserJs (T419137)]] (duration: 08m 08s)
* 00:01 catrope@deploy2002: catrope, kharlan: Continuing with sync
== 2026-03-05 ==
* 23:58 catrope@deploy2002: catrope, kharlan: Backport for [[gerrit:1248628{{!}}Re-enable AllowUserJs (T419137)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:56 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1248628{{!}}Re-enable AllowUserJs (T419137)]]
* 23:52 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248636{{!}}CSP: Update false positives list]] (duration: 06m 34s)
* 23:52 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint2003.wikimedia.org with OS trixie
* 23:47 catrope@deploy2002: catrope: Continuing with sync
* 23:47 catrope@deploy2002: catrope: Backport for [[gerrit:1248636{{!}}CSP: Update false positives list]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:45 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1248636{{!}}CSP: Update false positives list]]
* 23:33 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint2003.wikimedia.org with reason: host reimage
* 23:29 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on contint2003.wikimedia.org with reason: host reimage
* 23:15 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1238028{{!}}Using Hadoop for MostTranscludedPages on commonswiki (T416927)]] (duration: 06m 27s)
* 23:11 zabe@deploy2002: zabe: Continuing with sync
* 23:10 zabe@deploy2002: zabe: Backport for [[gerrit:1238028{{!}}Using Hadoop for MostTranscludedPages on commonswiki (T416927)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:09 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2003.wikimedia.org with OS trixie
* 23:08 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1238028{{!}}Using Hadoop for MostTranscludedPages on commonswiki (T416927)]]
* 22:45 maryum: Deployed security fix for [[phab:T418254|T418254]]
* 22:35 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248483{{!}}SpecialWantedFiles: Use lt_title instead of lt_to (T299953)]] (duration: 06m 12s)
* 22:31 zabe@deploy2002: zabe: Continuing with sync
* 22:30 zabe@deploy2002: zabe: Backport for [[gerrit:1248483{{!}}SpecialWantedFiles: Use lt_title instead of lt_to (T299953)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:28 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248483{{!}}SpecialWantedFiles: Use lt_title instead of lt_to (T299953)]]
* 21:43 ebernhardson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248508{{!}}cirrus: Align semanticsearch cluster group name with routing (T413969)]] (duration: 07m 20s)
* 21:39 ebernhardson@deploy2002: ebernhardson: Continuing with sync
* 21:38 ebernhardson@deploy2002: ebernhardson: Backport for [[gerrit:1248508{{!}}cirrus: Align semanticsearch cluster group name with routing (T413969)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:36 ebernhardson@deploy2002: Started scap sync-world: Backport for [[gerrit:1248508{{!}}cirrus: Align semanticsearch cluster group name with routing (T413969)]]
* 21:04 jhathaway@dns1004: END - running authdns-update
* 21:02 jhathaway@dns1004: START - running authdns-update
* 20:53 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:52 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new service IPs for sophroid - jasmine@cumin2002"
* 20:52 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new service IPs for sophroid - jasmine@cumin2002"
* 20:47 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 20:28 cdanis: apt built and imported jwt-authorizer 1.3.0-1
* 20:16 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.18 refs [[phab:T413809|T413809]]
* 20:04 krinkle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248574{{!}}Allow toolforge APIs in enforced CSP mode (T135963 T419137 T220475)]] (duration: 07m 37s)
* 20:00 krinkle@deploy2002: krinkle: Continuing with sync
* 19:58 krinkle@deploy2002: krinkle: Backport for [[gerrit:1248574{{!}}Allow toolforge APIs in enforced CSP mode (T135963 T419137 T220475)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:56 krinkle@deploy2002: Started scap sync-world: Backport for [[gerrit:1248574{{!}}Allow toolforge APIs in enforced CSP mode (T135963 T419137 T220475)]]
* 19:21 sbassett@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248571{{!}}Re-enable Site JS (T419137 T419138)]] (duration: 06m 57s)
* 19:17 sbassett@deploy2002: sbassett: Continuing with sync
* 19:16 sbassett@deploy2002: sbassett: Backport for [[gerrit:1248571{{!}}Re-enable Site JS (T419137 T419138)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:15 sbassett@deploy2002: Started scap sync-world: Backport for [[gerrit:1248571{{!}}Re-enable Site JS (T419137 T419138)]]
* 19:04 dr0ptp4kt: Deploying change {{Gerrit|1239200}} for refinery ( [[phab:T416481|T416481]] ) using scap, then deployed onto hdfs
* 19:03 dr0ptp4kt: Deployed refinery change {{Gerrit|1240253}} ( [[phab:T414478|T414478]] ), {{Gerrit|1240253}} (no-op) for refinery ( [[phab:T414478|T414478]] ) using scap, then deployed onto hdfs
* 18:58 dr0ptp4kt@deploy2002: Finished deploy [analytics/refinery@dd641b1] (thin): Regular analytics weekly train THIN [analytics/refinery@dd641b15] (duration: 02m 02s)
* 18:56 dr0ptp4kt@deploy2002: Started deploy [analytics/refinery@dd641b1] (thin): Regular analytics weekly train THIN [analytics/refinery@dd641b15]
* 18:55 dr0ptp4kt@deploy2002: Finished deploy [analytics/refinery@dd641b1]: Regular analytics weekly train [analytics/refinery@dd641b15] (duration: 04m 18s)
* 18:50 dr0ptp4kt@deploy2002: Started deploy [analytics/refinery@dd641b1]: Regular analytics weekly train [analytics/refinery@dd641b15]
* 18:49 dr0ptp4kt@deploy2002: Finished deploy [analytics/refinery@dd641b1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@dd641b15] (duration: 01m 57s)
* 18:47 dr0ptp4kt: Deploying change {{Gerrit|1239200}} for refinery ( [[phab:T416481|T416481]] )
* 18:47 dr0ptp4kt@deploy2002: Started deploy [analytics/refinery@dd641b1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@dd641b15]
* 18:31 eevans@dns1004: END - running authdns-update
* 18:30 eevans@dns1004: START - running authdns-update
* 18:30 sukhe: sudo cumin -b51 "A:cp" "run-puppet-agent --enable 'rolling out 1248544'"
* 18:16 sukhe: sudo cumin "A:cp" "disable-puppet 'rolling out 1248544'"
* 18:06 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:06 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
* 18:06 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
* 18:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 17:31 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248536{{!}}Enable wgUseSiteJs on donatewiki (T419138)]] (duration: 09m 57s)
* 17:27 mszwarc@deploy2002: mszwarc, krinkle: Continuing with sync
* 17:26 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint2003.wikimedia.org with OS bookworm
* 17:26 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:23 mszwarc@deploy2002: mszwarc, krinkle: Backport for [[gerrit:1248536{{!}}Enable wgUseSiteJs on donatewiki (T419138)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:21 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1248536{{!}}Enable wgUseSiteJs on donatewiki (T419138)]]
* 17:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1178.eqiad.wmnet
* 17:16 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:12 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1162.eqiad.wmnet
* 17:12 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1162.eqiad.wmnet
* 17:10 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wikikube-worker1162.eqiad.wmnet
* 17:10 cgoubert@cumin1003: START - Cookbook sre.hosts.remove-downtime for wikikube-worker1162.eqiad.wmnet
* 17:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 17:05 taavi@cumin1003: dbctl commit (dc=all): 'enable writes', diff saved to https://phabricator.wikimedia.org/P89812 and previous config saved to /var/cache/conftool/dbconfig/20260305-170556-taavi.json
* 16:03 oblivian@cumin1003: dbctl commit (dc=all): 'read only s6', diff saved to https://phabricator.wikimedia.org/P89810 and previous config saved to /var/cache/conftool/dbconfig/20260305-160348-oblivian.json
* 15:32 taavi@cumin1003: dbctl commit (dc=all): 'set global ro', diff saved to https://phabricator.wikimedia.org/P89808 and previous config saved to /var/cache/conftool/dbconfig/20260305-153203-taavi.json
* 15:31 mszwarc@deploy2002: mszwarc: Continuing with sync
* 15:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1178.eqiad.wmnet
* 15:31 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1248509{{!}}Disable custom JS for a moment]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:29 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1248509{{!}}Disable custom JS for a moment]]
* 15:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['contint2003']
* 15:25 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['contint2003']
* 15:23 ebernhardson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248506{{!}}cirrus: Correct semantic builder config (T413969)]] (duration: 07m 39s)
* 15:22 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:19 ebernhardson@deploy2002: ebernhardson: Continuing with sync
* 15:18 ebernhardson@deploy2002: ebernhardson: Backport for [[gerrit:1248506{{!}}cirrus: Correct semantic builder config (T413969)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:16 ebernhardson@deploy2002: Started scap sync-world: Backport for [[gerrit:1248506{{!}}cirrus: Correct semantic builder config (T413969)]]
* 15:11 elukey@cumin1003: END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 15:10 ebernhardson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244713{{!}}cirrus: Add semantic search test cluster (T413969)]] (duration: 09m 18s)
* 15:06 ebernhardson@deploy2002: ebernhardson: Continuing with sync
* 15:04 sukhe@dns1004: END - running authdns-update
* 15:03 sukhe@dns1004: START - running authdns-update
* 15:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:02 ebernhardson@deploy2002: ebernhardson: Backport for [[gerrit:1244713{{!}}cirrus: Add semantic search test cluster (T413969)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:02 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host contint2003
* 15:02 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host contint2003
* 15:00 ebernhardson@deploy2002: Started scap sync-world: Backport for [[gerrit:1244713{{!}}cirrus: Add semantic search test cluster (T413969)]]
* 14:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 14:53 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 14:50 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 14:38 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1010
* 14:38 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1010
* 14:32 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1010
* 14:32 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1010
* 14:32 sukhe@dns1004: END - running authdns-update
* 14:30 sukhe@dns1004: START - running authdns-update
* 14:28 elukey@cumin1003: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 14:28 sukhe@dns1004: START - running authdns-update
* 14:27 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1231.eqiad.wmnet
* 14:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1230.eqiad.wmnet
* 14:24 bking@dns1004: START - running authdns-update
* 14:15 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1230.eqiad.wmnet
* 14:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1229.eqiad.wmnet
* 14:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'ores-legacy' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 14:05 moritzm: imported nodejs 24.14.0-1nodesource1 to thirdparty/node24 [[phab:T418440|T418440]]
* 14:03 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1229.eqiad.wmnet
* 14:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1228.eqiad.wmnet
* 14:01 moritzm: initialised ganeti02/ulsfo cluster [[phab:T418993|T418993]]
* 13:57 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:56 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:56 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:53 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:52 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1228.eqiad.wmnet
* 13:52 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1227.eqiad.wmnet
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:46 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:43 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:42 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:42 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1199.eqiad.wmnet
* 13:40 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1227.eqiad.wmnet
* 13:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1226.eqiad.wmnet
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'ores-legacy' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:35 moritzm: installing glib2.0 security updates
* 13:35 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1199.eqiad.wmnet
* 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:26 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1226.eqiad.wmnet
* 13:26 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1225.eqiad.wmnet
* 13:26 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:15 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1225.eqiad.wmnet
* 13:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1224.eqiad.wmnet
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4006.ulsfo.wmnet with OS bookworm
* 13:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:07 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new VIP for routed ganeti in ulsfo - jmm@cumin2002"
* 13:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new VIP for routed ganeti in ulsfo - jmm@cumin2002"
* 13:06 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:05 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:02 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:01 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1224.eqiad.wmnet
* 13:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1223.eqiad.wmnet
* 13:00 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:58 cgoubert@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on wikikube-worker1162.eqiad.wmnet with reason: dcops intervention
* 12:57 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1162.eqiad.wmnet
* 12:56 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1162.eqiad.wmnet
* 12:55 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 12:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
* 12:49 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1223.eqiad.wmnet
* 12:49 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1222.eqiad.wmnet
* 12:46 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 12:43 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
* 12:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1222.eqiad.wmnet
* 12:37 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1221.eqiad.wmnet
* 12:23 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1221.eqiad.wmnet
* 12:23 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1220.eqiad.wmnet
* 12:23 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4006.ulsfo.wmnet with OS bookworm
* 12:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1220.eqiad.wmnet
* 11:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4005.ulsfo.wmnet
* 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4005.ulsfo.wmnet
* 11:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1236.eqiad.wmnet
* 11:29 moritzm: remove ganeti4006 from ganeti/ulsfo cluster [[phab:T418993|T418993]]
* 11:25 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1236.eqiad.wmnet
* 11:25 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1235.eqiad.wmnet
* 11:16 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1235.eqiad.wmnet
* 11:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1234.eqiad.wmnet
* 11:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1234.eqiad.wmnet
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1233.eqiad.wmnet
* 11:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1233.eqiad.wmnet
* 11:02 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1232.eqiad.wmnet
* 11:02 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 11:00 elukey@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4005.ulsfo.wmnet with OS bookworm
* 10:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 10:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1232.eqiad.wmnet
* 10:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1231.eqiad.wmnet
* 10:54 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 10:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1231.eqiad.wmnet
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1230.eqiad.wmnet
* 10:41 elukey@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4005.ulsfo.wmnet with reason: host reimage
* 10:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1230.eqiad.wmnet
* 10:37 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1229.eqiad.wmnet
* 10:32 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4005.ulsfo.wmnet with reason: host reimage
* 10:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1229.eqiad.wmnet
* 10:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1228.eqiad.wmnet
* 10:24 moritzm: installing Java 8 security updates
* 10:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1228.eqiad.wmnet
* 10:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1227.eqiad.wmnet
* 10:15 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1227.eqiad.wmnet
* 10:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1226.eqiad.wmnet
* 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
* 10:11 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4005.ulsfo.wmnet with OS bookworm
* 10:10 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ganeti4005.ulsfo.wmnet
* 10:08 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ganeti4005.ulsfo.wmnet
* 10:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add gw-virtual.ulsfo.wmnet - ayounsi@cumin1003"
* 10:07 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1226.eqiad.wmnet
* 10:07 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1225.eqiad.wmnet
* 09:59 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1225.eqiad.wmnet
* 09:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1224.eqiad.wmnet
* 09:52 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1224.eqiad.wmnet
* 09:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1223.eqiad.wmnet
* 09:44 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1223.eqiad.wmnet
* 09:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1222.eqiad.wmnet
* 09:43 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add gw-virtual.ulsfo.wmnet - ayounsi@cumin1003"
* 09:36 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1222.eqiad.wmnet
* 09:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1221.eqiad.wmnet
* 09:32 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:32 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:28 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1221.eqiad.wmnet
* 09:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1220.eqiad.wmnet
* 09:20 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1220.eqiad.wmnet
* 09:02 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
* 08:38 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247990{{!}}Drop 'centralnoticeadmin' from $wgOATHRequiredForGroups (T418580)]] (duration: 07m 07s)
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/airflow-main: apply
* 08:34 mszwarc@deploy2002: mszwarc: Continuing with sync
* 08:33 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1247990{{!}}Drop 'centralnoticeadmin' from $wgOATHRequiredForGroups (T418580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:30 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1247990{{!}}Drop 'centralnoticeadmin' from $wgOATHRequiredForGroups (T418580)]]
* 08:29 gehel@dns1004: END - running authdns-update
* 08:28 gehel@dns1004: START - running authdns-update
* 08:27 moritzm: installing mbedtls security updates
* 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 08:15 hashar@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248314{{!}}Revert "zhwiki: Add 2026 CNY celebration logos"]] (duration: 09m 19s)
* 08:11 hashar@deploy2002: hashar, stang: Continuing with sync
* 08:08 hashar@deploy2002: hashar, stang: Backport for [[gerrit:1248314{{!}}Revert "zhwiki: Add 2026 CNY celebration logos"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:06 hashar@deploy2002: Started scap sync-world: Backport for [[gerrit:1248314{{!}}Revert "zhwiki: Add 2026 CNY celebration logos"]]
* 08:02 moritzm: uploaded openjdk-8 8u482-ga-1~deb11u1 to component/jdk8 of bullseye-wikimedia
* 07:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts bast4005.wikimedia.org
* 07:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:56 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast4005.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast4005.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:48 moritzm: uploaded bird2 2.18-1~wmf13u2 to the main component of trixie-wikimedia [[phab:T413740|T413740]]
* 07:47 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:47 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 07:42 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts bast4005.wikimedia.org
* 06:35 marostegui@cumin1003: dbctl commit (dc=all): 'Remove es1033 [[phab:T408772|T408772]]', diff saved to https://phabricator.wikimedia.org/P89804 and previous config saved to /var/cache/conftool/dbconfig/20260305-063548-marostegui.json
* 02:10 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 55s)
* 02:02 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 02:01 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248163{{!}}Stop writing to il_to on medium size wikis (T415787)]] (duration: 06m 14s)
* 01:58 zabe@deploy2002: zabe: Continuing with sync
* 01:57 zabe@deploy2002: zabe: Backport for [[gerrit:1248163{{!}}Stop writing to il_to on medium size wikis (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:55 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248163{{!}}Stop writing to il_to on medium size wikis (T415787)]]
* 01:40 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1246099{{!}}Start reading from new file tables on medium wikis (T416548)]] (duration: 06m 15s)
* 01:36 zabe@deploy2002: zabe: Continuing with sync
* 01:36 zabe@deploy2002: zabe: Backport for [[gerrit:1246099{{!}}Start reading from new file tables on medium wikis (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:34 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1246099{{!}}Start reading from new file tables on medium wikis (T416548)]]
* 01:29 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248154{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]], [[gerrit:1248153{{!}}Revert^2 "ImageListPager: Properly support file schema migration read new"]] (duration: 07m 21s)
* 01:25 zabe@deploy2002: zabe: Continuing with sync
* 01:23 zabe@deploy2002: zabe: Backport for [[gerrit:1248154{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]], [[gerrit:1248153{{!}}Revert^2 "ImageListPager: Properly support file schema migration read new"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:21 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248154{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]], [[gerrit:1248153{{!}}Revert^2 "ImageListPager: Properly support file schema migration read new"]]
* 00:55 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248021{{!}}Stop writing to il_to on small wikis (T415787)]] (duration: 06m 49s)
* 00:51 zabe@deploy2002: zabe: Continuing with sync
* 00:50 zabe@deploy2002: zabe: Backport for [[gerrit:1248021{{!}}Stop writing to il_to on small wikis (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:48 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248021{{!}}Stop writing to il_to on small wikis (T415787)]]
* 00:19 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248125{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]], [[gerrit:1248123{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]] (duration: 08m 52s)
* 00:13 zabe@deploy2002: zabe: Continuing with sync
* 00:12 zabe@deploy2002: zabe: Backport for [[gerrit:1248125{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]], [[gerrit:1248123{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:10 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248125{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]], [[gerrit:1248123{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]]
== 2026-03-04 ==
* 22:57 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 22:56 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 22:55 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 22:55 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 22:55 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 22:54 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 22:35 tgr_: UTC late deploys done
* 22:33 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248084{{!}}Introduce a Semantic Search query route and builder (T413969)]], [[gerrit:1248085{{!}}Wire up semantic query building (T413969)]] (duration: 38m 28s)
* 22:16 tgr@deploy2002: tgr, ebernhardson: Continuing with sync
* 22:14 tgr@deploy2002: tgr, ebernhardson: Backport for [[gerrit:1248084{{!}}Introduce a Semantic Search query route and builder (T413969)]], [[gerrit:1248085{{!}}Wire up semantic query building (T413969)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248084{{!}}Introduce a Semantic Search query route and builder (T413969)]], [[gerrit:1248085{{!}}Wire up semantic query building (T413969)]]
* 21:48 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248012{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #3) (T415007 T418999)]] (duration: 07m 05s)
* 21:47 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on dse-k8s-worker1028.eqiad.wmnet with reason: broken networking
* 21:44 tgr@deploy2002: tgr: Continuing with sync
* 21:43 tgr@deploy2002: tgr: Backport for [[gerrit:1248012{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #3) (T415007 T418999)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:40 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248012{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #3) (T415007 T418999)]]
* 21:36 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248081{{!}}Add synthetic AAA experiment (T418614)]], [[gerrit:1248080{{!}}Add synthetic AAA experiment (T418614)]] (duration: 09m 11s)
* 21:35 bking@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1028.eqiad.wmnet
* 21:32 tgr@deploy2002: cjming, tgr: Continuing with sync
* 21:30 bking@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1028.eqiad.wmnet
* 21:29 tgr@deploy2002: cjming, tgr: Backport for [[gerrit:1248081{{!}}Add synthetic AAA experiment (T418614)]], [[gerrit:1248080{{!}}Add synthetic AAA experiment (T418614)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:27 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248081{{!}}Add synthetic AAA experiment (T418614)]], [[gerrit:1248080{{!}}Add synthetic AAA experiment (T418614)]]
* 21:21 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1245473{{!}}logging: set poolcounter channel log level to info (T418612)]] (duration: 09m 04s)
* 21:17 tgr@deploy2002: tgr, cwhite: Continuing with sync
* 21:14 tgr@deploy2002: tgr, cwhite: Backport for [[gerrit:1245473{{!}}logging: set poolcounter channel log level to info (T418612)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:12 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1245473{{!}}logging: set poolcounter channel log level to info (T418612)]]
* 21:07 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248007{{!}}Fix $wgJwtSessionCookieIssuer (T415007 T418999)]] (duration: 09m 55s)
* 21:03 tgr@deploy2002: tgr: Continuing with sync
* 20:59 tgr@deploy2002: tgr: Backport for [[gerrit:1248007{{!}}Fix $wgJwtSessionCookieIssuer (T415007 T418999)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:57 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248007{{!}}Fix $wgJwtSessionCookieIssuer (T415007 T418999)]]
* 19:56 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.18 refs [[phab:T413809|T413809]]
* 19:44 jhuneidi@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248011{{!}}CategoryViewer: Fall back to empty string in case of missing nextpage (T418934)]] (duration: 10m 47s)
* 19:44 cdobbins@cumin2002: conftool action : set/pooled=yes:weight=1; selector: name=cp205[0-8].codfw.wmnet
* 19:43 cdobbins@cumin2002: conftool action : set/pooled=yes:weight=1; selector: name=cp2049.codfw.wmnet
* 19:40 jhuneidi@deploy2002: zabe, jhuneidi: Continuing with sync
* 19:35 jhuneidi@deploy2002: zabe, jhuneidi: Backport for [[gerrit:1248011{{!}}CategoryViewer: Fall back to empty string in case of missing nextpage (T418934)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:34 brett@puppetserver1001: conftool action : set/weight=1; selector: name=cp2043.*
* 19:34 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2043.*
* 19:33 jhuneidi@deploy2002: Started scap sync-world: Backport for [[gerrit:1248011{{!}}CategoryViewer: Fall back to empty string in case of missing nextpage (T418934)]]
* 19:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2043.codfw.wmnet with OS trixie
* 19:23 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 19:22 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 19:22 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 19:22 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 19:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2043.codfw.wmnet with reason: host reimage
* 19:06 brett@puppetserver1001: conftool action : set/weight=1; selector: name=cp204[45678].*
* 19:04 brett@puppetserver1001: conftool action : set/weight=100; selector: name=cp204[45678].*
* 19:02 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2043.codfw.wmnet with reason: host reimage
* 18:58 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp204[45678].*
* 18:52 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 18:51 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 18:50 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 18:50 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 18:49 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:49 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:49 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 18:48 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 18:48 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 18:47 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 18:47 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 18:47 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp2043.codfw.wmnet with OS trixie
* 18:46 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 18:43 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 18:42 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 18:41 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 18:41 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 18:40 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:40 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:40 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 18:40 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 18:39 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 18:39 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 18:38 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 18:37 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 18:37 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:37 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:32 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:32 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:16 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 18:16 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 18:16 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 18:15 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 18:15 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:14 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:14 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 18:13 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 18:13 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 18:13 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 18:12 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 18:12 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2047.codfw.wmnet with OS trixie
* 17:33 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
* 17:27 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
* 17:23 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:23 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:18 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:18 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:15 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:13 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:09 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp2047.codfw.wmnet with OS trixie
* 16:55 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:55 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:54 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:54 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1007.eqiad.wmnet with OS bookworm
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-unlock-scap (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:39 root@deploy2002: Unlocked for deployment [ALL REPOSITORIES]: Datacenter switchover from eqiad to codfw - [[phab:T418133|T418133]] (duration: 25m 37s)
* 16:39 root@deploy2002: Forcefully removing global lock: Datacenter switchover from eqiad to codfw - [[phab:T418133|T418133]]
* 16:39 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-unlock-scap for datacenter switchover from eqiad to codfw
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
* 16:27 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
* 16:27 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters for datacenter switchover from eqiad to codfw
* 16:27 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-restore-ttl (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:26 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-restore-ttl for datacenter switchover from eqiad to codfw
* 16:26 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:26 root@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 16:26 root@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 16:26 root@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 16:26 root@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 16:26 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.08-start-maintenance for datacenter switchover from eqiad to codfw
* 16:25 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:25 root@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: sync
* 16:25 root@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: sync
* 16:25 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: [DRY-RUN] MediaWiki read-only period ends at: 2026-03-04 16:24:40.502004
* 16:24 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.07-set-readwrite for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1007.eqiad.wmnet with reason: host reimage
* 16:24 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki for datacenter switchover from eqiad to codfw
* 16:23 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:23 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.03-set-db-readonly for datacenter switchover from eqiad to codfw
* 16:23 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.02-set-readonly (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:22 blake@cumin1003: [DRY-RUN] MediaWiki read-only period starts at: 2026-03-04 16:22:41.755892
* 16:22 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.02-set-readonly for datacenter switchover from eqiad to codfw
* 16:20 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1007.eqiad.wmnet with reason: host reimage
* 16:20 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:20 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance for datacenter switchover from eqiad to codfw
* 16:19 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:14 moritzm: upgrading cloudservices* to Bird 2.18 [[phab:T413740|T413740]]
* 16:14 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl for datacenter switchover from eqiad to codfw
* 16:13 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-lock-scap (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:13 root@deploy2002: Locking from deployment [ALL REPOSITORIES]: Datacenter switchover from eqiad to codfw - [[phab:T418133|T418133]]
* 16:13 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-lock-scap for datacenter switchover from eqiad to codfw
* 16:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
* 16:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
* 16:10 moritzm: remove ganeti4005 from ganeti/ulsfo cluster [[phab:T418993|T418993]]
* 16:10 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1007.eqiad.wmnet with OS bookworm
* 16:06 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:06 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks for datacenter switchover from eqiad to codfw
* 15:59 XioNoX: push pfw policies - [[phab:T418402|T418402]]
* 15:37 sukhe@dns1004: END - running authdns-update
* 15:36 sukhe@dns1004: START - running authdns-update
* 15:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1219.eqiad.wmnet
* 15:32 aqu@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 15:31 aqu@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
* 15:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4005.ulsfo.wmnet
* 15:29 cgoubert@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on P<nowiki>{</nowiki>ms-fe10[14-24].*<nowiki>}</nowiki> and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 15:24 cgoubert@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on P<nowiki>{</nowiki>ms-fe10[14-24].*<nowiki>}</nowiki> and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 15:22 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 15:22 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 15:22 cgoubert@cumin1003: END (ERROR) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=97) rolling restart_daemons on A:swift-fe-eqiad
* 15:21 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1219.eqiad.wmnet
* 15:21 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1218.eqiad.wmnet
* 15:19 cgoubert@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-eqiad
* 15:16 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: name=cirrussearch1120.eqiad.wmnet
* 15:16 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: name=cirrussearch1121.eqiad.wmnet
* 15:16 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1115.eqiad.wmnet [reason: [[phab:T418772|T418772]] - BGP maintenance]
* 15:16 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:16 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: name=cirrussearch1122.eqiad.wmnet
* 15:15 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:15 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:14 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:13 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:13 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:10 XioNoX: lsw1-d7-eqiad# tools network-instance default protocols bgp neighbor 10.64.128.17 reset-peer - [[phab:T418772|T418772]]
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
* 15:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1218.eqiad.wmnet
* 15:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1217.eqiad.wmnet
* 15:09 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:05 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:05 moritzm: upgrading cloudlb* to Bird 2.18 [[phab:T413740|T413740]]
* 15:05 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:04 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:58 Dreamy_Jazz: Afternoon UTC backport window done
* 14:58 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244373{{!}}zhwiki: Remove all rights from accountcreator (T418089)]] (duration: 08m 12s)
* 14:57 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:57 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1217.eqiad.wmnet
* 14:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1216.eqiad.wmnet
* 14:57 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6 days, 0:00:00 on dse-k8s-worker[1010-1011,1013,1018-1019].eqiad.wmnet with reason: Adding 10 Gbps NIC
* 14:54 dreamyjazz@deploy2002: dreamyjazz, 1f616emo: Continuing with sync
* 14:52 jmm@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
* 14:52 dreamyjazz@deploy2002: dreamyjazz, 1f616emo: Backport for [[gerrit:1244373{{!}}zhwiki: Remove all rights from accountcreator (T418089)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:50 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1244373{{!}}zhwiki: Remove all rights from accountcreator (T418089)]]
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1216.eqiad.wmnet
* 14:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1215.eqiad.wmnet
* 14:44 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248009{{!}}Hooks: Fix liquidthreads log type definition bugs (T417425 T419006)]], [[gerrit:1248008{{!}}Define $wgWikimediaMessagesHasLiquidThreadsLogs (T417425)]] (duration: 07m 11s)
* 14:44 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1115.eqiad.wmnet [reason: [[phab:T418772|T418772]] - BGP maintenance]
* 14:44 taavi: updating CR firewall policy with https://gerrit.wikimedia.org/r/c/operations/homer/public/+/970275
* 14:42 ayounsi@cumin1003: conftool action : set/pooled=no; selector: name=cirrussearch1122.eqiad.wmnet
* 14:42 ayounsi@cumin1003: conftool action : set/pooled=no; selector: name=cirrussearch1121.eqiad.wmnet
* 14:42 ayounsi@cumin1003: conftool action : set/pooled=no; selector: name=cirrussearch1120.eqiad.wmnet
* 14:40 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 14:39 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1248009{{!}}Hooks: Fix liquidthreads log type definition bugs (T417425 T419006)]], [[gerrit:1248008{{!}}Define $wgWikimediaMessagesHasLiquidThreadsLogs (T417425)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:37 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1248009{{!}}Hooks: Fix liquidthreads log type definition bugs (T417425 T419006)]], [[gerrit:1248008{{!}}Define $wgWikimediaMessagesHasLiquidThreadsLogs (T417425)]]
* 14:33 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1215.eqiad.wmnet
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1214.eqiad.wmnet
* 14:32 btullis@puppetserver1001: conftool action : get/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1028.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1028.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1025.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1024.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/weight=1; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1028.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/weight=1; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1025.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/weight=1; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1024.eqiad.wmnet
* 14:30 btullis@puppetserver1001: conftool action : get/pooled; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1024.eqiad.wmnet
* 14:29 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_esams and A:cp - 3.0 upgrade ()
* 14:27 arnaudb@dns1004: END - running authdns-update
* 14:26 arnaudb@dns1004: START - running authdns-update
* 14:26 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248000{{!}}Revert "Enable JWT session cookie for bot passwords (all wikis) (attempt #2)" (T415007 T418999)]] (duration: 07m 19s)
* 14:22 tgr@deploy2002: tgr: Continuing with sync
* 14:21 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1214.eqiad.wmnet
* 14:21 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1213.eqiad.wmnet
* 14:21 tgr@deploy2002: tgr: Backport for [[gerrit:1248000{{!}}Revert "Enable JWT session cookie for bot passwords (all wikis) (attempt #2)" (T415007 T418999)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:19 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248000{{!}}Revert "Enable JWT session cookie for bot passwords (all wikis) (attempt #2)" (T415007 T418999)]]
* 14:14 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247566{{!}}Enable new HTML confirmation emails for all (T416748)]] (duration: 07m 46s)
* 14:13 elukey@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 14:13 elukey@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 14:10 sgimeno@deploy2002: migr, sgimeno: Continuing with sync
* 14:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1213.eqiad.wmnet
* 14:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1212.eqiad.wmnet
* 14:09 sgimeno@deploy2002: migr, sgimeno: Backport for [[gerrit:1247566{{!}}Enable new HTML confirmation emails for all (T416748)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 elukey@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 14:08 elukey@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 14:08 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:07 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:07 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1247566{{!}}Enable new HTML confirmation emails for all (T416748)]]
* 13:59 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4005.ulsfo.wmnet
* 13:58 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4005.ulsfo.wmnet
* 13:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4005.ulsfo.wmnet
* 13:57 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1212.eqiad.wmnet
* 13:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1211.eqiad.wmnet
* 13:49 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_esams and A:cp - 3.0 upgrade ()
* 13:45 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1211.eqiad.wmnet
* 13:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1210.eqiad.wmnet
* 13:43 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_esams and A:cp - 3.0 upgrade ()
* 13:40 arnaudb@dns1004: END - running authdns-update
* 13:39 arnaudb@dns1004: START - running authdns-update
* 13:37 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
* 13:33 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1210.eqiad.wmnet
* 13:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1209.eqiad.wmnet
* 13:20 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1209.eqiad.wmnet
* 13:20 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1208.eqiad.wmnet
* 13:17 aokoth@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:17 aokoth@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:16 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:15 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1208.eqiad.wmnet
* 13:06 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1207.eqiad.wmnet
* 13:03 arnaudb@dns1005: END - running authdns-update
* 13:02 arnaudb@dns1005: START - running authdns-update
* 13:00 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_esams and A:cp - 3.0 upgrade ()
* 13:00 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
* 12:46 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 12:45 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 12:44 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 12:44 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
* 12:43 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 12:43 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
* 12:33 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 12:29 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
* 12:10 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
* 12:08 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp - 3.0 upgrade ()
* 12:03 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1207.eqiad.wmnet
* 12:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1206.eqiad.wmnet
* 11:53 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1206.eqiad.wmnet
* 11:53 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1205.eqiad.wmnet
* 11:36 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f8-eqiad
* 11:36 jmm@cumin2002: START - Cookbook sre.network.tls for network device lsw1-f8-eqiad
* 11:34 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp - 3.0 upgrade ()
* 11:34 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
* 11:28 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247968{{!}}SI: Update instrumentation schema (T418293)]] (duration: 16m 22s)
* 11:22 fabfur: start upgrading haproxy to 3.0 on A:cp-eqiad ([[phab:T417253|T417253]])
* 11:22 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 11:17 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1247968{{!}}SI: Update instrumentation schema (T418293)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:13 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp - 3.0 upgrade ()
* 11:12 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1247968{{!}}SI: Update instrumentation schema (T418293)]]
* 11:08 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp - 3.0 upgrade ()
* 11:07 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=1) rolling reimage on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 11:07 blake@cumin1003: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 11:06 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2356].codfw.wmnet
* 11:06 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2356].codfw.wmnet
* 11:03 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=1) rolling reimage on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 11:03 blake@cumin1003: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 10:55 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2356].codfw.wmnet
* 10:55 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2356].codfw.wmnet
* 10:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1205.eqiad.wmnet
* 10:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1204.eqiad.wmnet
* 10:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 10:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 10:42 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1204.eqiad.wmnet
* 10:42 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1203.eqiad.wmnet
* 10:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 10:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 10:28 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1203.eqiad.wmnet
* 10:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1202.eqiad.wmnet
* 10:25 fabfur: start upgrading haproxy to 3.0 on A:cp-drmrs ([[phab:T417253|T417253]])
* 10:25 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp - 3.0 upgrade ()
* 10:25 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp - 3.0 upgrade ()
* 10:24 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247941{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]], [[gerrit:1247944{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]] (duration: 06m 42s)
* 10:22 arnaudb@dns1004: END - running authdns-update
* 10:20 arnaudb@dns1004: START - running authdns-update
* 10:20 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 10:20 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1247941{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]], [[gerrit:1247944{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:18 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1247941{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]], [[gerrit:1247944{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]]
* 10:16 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1202.eqiad.wmnet
* 10:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1201.eqiad.wmnet
* 10:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 10:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 10:06 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 10:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 10:04 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1201.eqiad.wmnet
* 10:04 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1200.eqiad.wmnet
* 09:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 09:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1200.eqiad.wmnet
* 09:39 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247925{{!}}Require 2FA from CentralNotice admins and WMF Trust & Safety (T418580 T417880)]] (duration: 08m 23s)
* 09:36 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_codfw and A:cp - 3.0 upgrade ()
* 09:35 mszwarc@deploy2002: mszwarc: Continuing with sync
* 09:33 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1247925{{!}}Require 2FA from CentralNotice admins and WMF Trust & Safety (T418580 T417880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pki-root1002.eqiad.wmnet
* 09:31 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp - 3.0 upgrade ()
* 09:31 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1247925{{!}}Require 2FA from CentralNotice admins and WMF Trust & Safety (T418580 T417880)]]
* 09:30 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host pki-root1002.eqiad.wmnet
* 09:20 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:20 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:20 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:20 jmm@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:03 gehel: switching off Blazegraph on wdqs2009 (legacy full graph endpoint is end of life) - [[phab:T411410|T411410]] / [[phab:T415073|T415073]]
* 09:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 09:02 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1096.eqiad.wmnet
* 09:00 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet
* 08:59 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 08:58 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 08:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 08:56 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1096.eqiad.wmnet
* 08:54 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet
* 08:52 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 08:49 topranks: disabling IBGP session between ssw1-d1-eqiad and ssw1-d8-eqiad to remove backup paths (try #2) [[phab:T411054|T411054]]
* 08:36 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on backup1007.eqiad.wmnet,dbprov1004.eqiad.wmnet with reason: network maintenance
* 08:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 08:31 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 08:21 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp - 3.0 upgrade ()
* 08:21 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_codfw and A:cp - 3.0 upgrade ()
* 08:11 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp5032.*
* 07:54 topranks: disabling IBGP session between ssw1-d1-eqiad and ssw1-d8-eqiad to remove backup paths [[phab:T411054|T411054]]
* 07:43 moritzm: installing libbpf updates from Bookworm point release
* 05:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 05:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 04s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 01:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89793 and previous config saved to /var/cache/conftool/dbconfig/20260304-015657-marostegui.json
* 01:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P89792 and previous config saved to /var/cache/conftool/dbconfig/20260304-014150-marostegui.json
* 01:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P89791 and previous config saved to /var/cache/conftool/dbconfig/20260304-012642-marostegui.json
* 01:23 zabe@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 01:22 zabe@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 01:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89790 and previous config saved to /var/cache/conftool/dbconfig/20260304-011134-marostegui.json
* 00:46 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1263 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89789 and previous config saved to /var/cache/conftool/dbconfig/20260304-004638-marostegui.json
* 00:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1263.eqiad.wmnet with reason: Maintenance
* 00:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89788 and previous config saved to /var/cache/conftool/dbconfig/20260304-004615-marostegui.json
* 00:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P89787 and previous config saved to /var/cache/conftool/dbconfig/20260304-003107-marostegui.json
* 00:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P89786 and previous config saved to /var/cache/conftool/dbconfig/20260304-001559-marostegui.json
* 00:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89785 and previous config saved to /var/cache/conftool/dbconfig/20260304-000052-marostegui.json
== 2026-03-03 ==
* 23:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1262 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89784 and previous config saved to /var/cache/conftool/dbconfig/20260303-233500-marostegui.json
* 23:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1262.eqiad.wmnet with reason: Maintenance
* 23:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89783 and previous config saved to /var/cache/conftool/dbconfig/20260303-233436-marostegui.json
* 23:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P89782 and previous config saved to /var/cache/conftool/dbconfig/20260303-231929-marostegui.json
* 23:10 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 23:08 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 23:08 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:07 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:05 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 23:05 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 23:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P89781 and previous config saved to /var/cache/conftool/dbconfig/20260303-230421-marostegui.json
* 23:04 bking@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1028.eqiad.wmnet
* 23:02 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247689{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247690{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247596{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #2) (T415007)]] (duration: 21m 47s)
* 23:00 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7008.magru.wmnet [reason: lldpd packet drop issues]
* 22:58 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7008 [reason: lldpd packet drop issues]
* 22:58 tgr@deploy2002: tgr: Continuing with sync
* 22:56 bking@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1028.eqiad.wmnet
* 22:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89780 and previous config saved to /var/cache/conftool/dbconfig/20260303-224913-marostegui.json
* 22:45 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 22:45 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 22:44 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 22:44 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 22:42 tgr@deploy2002: tgr: Backport for [[gerrit:1247689{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247690{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247596{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #2) (T415007)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:40 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1247689{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247690{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247596{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #2) (T415007)]]
* 22:26 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 22:26 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 22:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1261 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89779 and previous config saved to /var/cache/conftool/dbconfig/20260303-222324-marostegui.json
* 22:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1261.eqiad.wmnet with reason: Maintenance
* 22:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89778 and previous config saved to /var/cache/conftool/dbconfig/20260303-222301-marostegui.json
* 22:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P89777 and previous config saved to /var/cache/conftool/dbconfig/20260303-220754-marostegui.json
* 21:59 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1245162 [[phab:T411807|T411807]] (duration: 12m 15s)
* 21:58 rzl@deploy2002: rzl: Continuing with sync
* 21:56 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 21:56 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 21:55 rzl@deploy2002: rzl: https://gerrit.wikimedia.org/r/1245162 [[phab:T411807|T411807]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1245162 [[phab:T411807|T411807]]
* 21:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P89776 and previous config saved to /var/cache/conftool/dbconfig/20260303-215247-marostegui.json
* 21:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89775 and previous config saved to /var/cache/conftool/dbconfig/20260303-214931-marostegui.json
* 21:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp2045.codfw.wmnet
* 21:48 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp2045.codfw.wmnet
* 21:40 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 21:39 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 21:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89774 and previous config saved to /var/cache/conftool/dbconfig/20260303-213739-marostegui.json
* 21:35 jhuneidi@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244748{{!}}REST: show the beta Attribution API in the REST Sandbox (T418522)]] (duration: 07m 41s)
* 21:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P89773 and previous config saved to /var/cache/conftool/dbconfig/20260303-213423-marostegui.json
* 21:32 jhuneidi@deploy2002: jhuneidi, bpirkle: Continuing with sync
* 21:30 jhuneidi@deploy2002: jhuneidi, bpirkle: Backport for [[gerrit:1244748{{!}}REST: show the beta Attribution API in the REST Sandbox (T418522)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:28 jhuneidi@deploy2002: Started scap sync-world: Backport for [[gerrit:1244748{{!}}REST: show the beta Attribution API in the REST Sandbox (T418522)]]
* 21:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P89772 and previous config saved to /var/cache/conftool/dbconfig/20260303-211915-marostegui.json
* 21:18 jhuneidi@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247652{{!}}Remove redundant mw-extra wgRestSandboxSpecs entry]] (duration: 06m 56s)
* 21:14 jhuneidi@deploy2002: jhuneidi, aaron: Continuing with sync
* 21:13 jhuneidi@deploy2002: jhuneidi, aaron: Backport for [[gerrit:1247652{{!}}Remove redundant mw-extra wgRestSandboxSpecs entry]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:11 jhuneidi@deploy2002: Started scap sync-world: Backport for [[gerrit:1247652{{!}}Remove redundant mw-extra wgRestSandboxSpecs entry]]
* 21:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1260 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89771 and previous config saved to /var/cache/conftool/dbconfig/20260303-211033-marostegui.json
* 21:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1260.eqiad.wmnet with reason: Maintenance
* 21:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89770 and previous config saved to /var/cache/conftool/dbconfig/20260303-211009-marostegui.json
* 21:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89769 and previous config saved to /var/cache/conftool/dbconfig/20260303-210407-marostegui.json
* 20:58 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp2045.codfw.wmnet with reason: troubleshooting for [[phab:T418527|T418527]]
* 20:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P89768 and previous config saved to /var/cache/conftool/dbconfig/20260303-205502-marostegui.json
* 20:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7008.magru.wmnet with OS trixie
* 20:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89767 and previous config saved to /var/cache/conftool/dbconfig/20260303-204452-marostegui.json
* 20:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2248.codfw.wmnet with reason: Maintenance
* 20:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89766 and previous config saved to /var/cache/conftool/dbconfig/20260303-204439-marostegui.json
* 20:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P89765 and previous config saved to /var/cache/conftool/dbconfig/20260303-203954-marostegui.json
* 20:34 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:34 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P89764 and previous config saved to /var/cache/conftool/dbconfig/20260303-202931-marostegui.json
* 20:24 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7008.magru.wmnet with reason: host reimage
* 20:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89763 and previous config saved to /var/cache/conftool/dbconfig/20260303-202447-marostegui.json
* 20:17 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7008.magru.wmnet with reason: host reimage
* 20:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P89762 and previous config saved to /var/cache/conftool/dbconfig/20260303-201423-marostegui.json
* 20:10 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1199.eqiad.wmnet
* 19:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89761 and previous config saved to /var/cache/conftool/dbconfig/20260303-195916-marostegui.json
* 19:59 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1252 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89760 and previous config saved to /var/cache/conftool/dbconfig/20260303-195900-marostegui.json
* 19:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1252.eqiad.wmnet with reason: Maintenance
* 19:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89759 and previous config saved to /var/cache/conftool/dbconfig/20260303-195835-marostegui.json
* 19:51 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7008.magru.wmnet with OS trixie
* 19:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P89758 and previous config saved to /var/cache/conftool/dbconfig/20260303-194327-marostegui.json
* 19:42 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp2043.codfw.wmnet
* 19:42 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp2043.codfw.wmnet
* 19:33 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89757 and previous config saved to /var/cache/conftool/dbconfig/20260303-193351-marostegui.json
* 19:33 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 19:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89756 and previous config saved to /var/cache/conftool/dbconfig/20260303-193338-marostegui.json
* 19:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P89755 and previous config saved to /var/cache/conftool/dbconfig/20260303-192820-marostegui.json
* 19:19 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.18 refs [[phab:T413809|T413809]]
* 19:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P89754 and previous config saved to /var/cache/conftool/dbconfig/20260303-191830-marostegui.json
* 19:17 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2047.codfw.wmnet with OS trixie
* 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89753 and previous config saved to /var/cache/conftool/dbconfig/20260303-191312-marostegui.json
* 19:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P89752 and previous config saved to /var/cache/conftool/dbconfig/20260303-190323-marostegui.json
* 18:53 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
* 18:49 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1199.eqiad.wmnet
* 18:49 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1198.eqiad.wmnet
* 18:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1249 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89751 and previous config saved to /var/cache/conftool/dbconfig/20260303-184937-marostegui.json
* 18:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1249.eqiad.wmnet with reason: Maintenance
* 18:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89750 and previous config saved to /var/cache/conftool/dbconfig/20260303-184913-marostegui.json
* 18:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89749 and previous config saved to /var/cache/conftool/dbconfig/20260303-184815-marostegui.json
* 18:47 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
* 18:45 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1096.eqiad.wmnet with OS bullseye
* 18:36 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1198.eqiad.wmnet
* 18:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1197.eqiad.wmnet
* 18:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P89747 and previous config saved to /var/cache/conftool/dbconfig/20260303-183406-marostegui.json
* 18:29 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp2047.codfw.wmnet with OS trixie
* 18:24 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1197.eqiad.wmnet
* 18:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1196.eqiad.wmnet
* 18:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89746 and previous config saved to /var/cache/conftool/dbconfig/20260303-182346-marostegui.json
* 18:23 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1096.eqiad.wmnet with reason: host reimage
* 18:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 18:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89745 and previous config saved to /var/cache/conftool/dbconfig/20260303-182321-marostegui.json
* 18:19 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1096.eqiad.wmnet with reason: host reimage
* 18:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P89744 and previous config saved to /var/cache/conftool/dbconfig/20260303-181859-marostegui.json
* 18:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1196.eqiad.wmnet
* 18:11 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1195.eqiad.wmnet
* 18:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P89743 and previous config saved to /var/cache/conftool/dbconfig/20260303-180814-marostegui.json
* 18:04 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247635{{!}}Style fixes for copy-paste feature (T414072)]] (duration: 32m 54s)
* 18:04 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89742 and previous config saved to /var/cache/conftool/dbconfig/20260303-180352-marostegui.json
* 18:02 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1096.eqiad.wmnet with OS bullseye
* 18:02 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:59 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1195.eqiad.wmnet
* 17:59 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host an-worker1194.eqiad.wmnet
* 17:55 ariel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:53 ariel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P89741 and previous config saved to /var/cache/conftool/dbconfig/20260303-175304-marostegui.json
* 17:52 jforrester@deploy2002: jforrester: Continuing with sync
* 17:51 jforrester@deploy2002: jforrester: Backport for [[gerrit:1247635{{!}}Style fixes for copy-paste feature (T414072)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:47 ariel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:46 ariel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1194.eqiad.wmnet
* 17:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1193.eqiad.wmnet
* 17:39 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89740 and previous config saved to /var/cache/conftool/dbconfig/20260303-173914-marostegui.json
* 17:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1248.eqiad.wmnet with reason: Maintenance
* 17:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89739 and previous config saved to /var/cache/conftool/dbconfig/20260303-173850-marostegui.json
* 17:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89738 and previous config saved to /var/cache/conftool/dbconfig/20260303-173756-marostegui.json
* 17:31 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1247635{{!}}Style fixes for copy-paste feature (T414072)]]
* 17:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1193.eqiad.wmnet
* 17:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1192.eqiad.wmnet
* 17:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P89736 and previous config saved to /var/cache/conftool/dbconfig/20260303-172343-marostegui.json
* 17:18 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1192.eqiad.wmnet
* 17:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1191.eqiad.wmnet
* 17:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89735 and previous config saved to /var/cache/conftool/dbconfig/20260303-171149-marostegui.json
* 17:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 17:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89734 and previous config saved to /var/cache/conftool/dbconfig/20260303-171126-marostegui.json
* 17:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P89733 and previous config saved to /var/cache/conftool/dbconfig/20260303-170835-marostegui.json
* 17:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1191.eqiad.wmnet
* 17:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1190.eqiad.wmnet
* 16:56 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1190.eqiad.wmnet
* 16:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P89732 and previous config saved to /var/cache/conftool/dbconfig/20260303-165618-marostegui.json
* 16:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89731 and previous config saved to /var/cache/conftool/dbconfig/20260303-165327-marostegui.json
* 16:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1189.eqiad.wmnet
* 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P89730 and previous config saved to /var/cache/conftool/dbconfig/20260303-164111-marostegui.json
* 16:34 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 16:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1189.eqiad.wmnet
* 16:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1188.eqiad.wmnet
* 16:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89729 and previous config saved to /var/cache/conftool/dbconfig/20260303-162845-marostegui.json
* 16:28 fceratto@cumin1003: dbctl commit (dc=all): 'Setting x1 codfw weights to 300 [[phab:T416705|T416705]]', diff saved to https://phabricator.wikimedia.org/P89728 and previous config saved to /var/cache/conftool/dbconfig/20260303-162836-fceratto.json
* 16:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1247.eqiad.wmnet with reason: Maintenance
* 16:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89727 and previous config saved to /var/cache/conftool/dbconfig/20260303-162603-marostegui.json
* 16:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 16:18 fceratto@cumin1003: dbctl commit (dc=all): 'Setting db1188 weight to 100 [[phab:T416705|T416705]]', diff saved to https://phabricator.wikimedia.org/P89726 and previous config saved to /var/cache/conftool/dbconfig/20260303-161846-fceratto.json
* 16:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 16:17 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1188.eqiad.wmnet
* 16:17 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1187.eqiad.wmnet
* 16:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1166: testing:crash
* 16:14 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1166: testing:crash
* 16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Setting db1169 weight to 300 [[phab:T416705|T416705]]', diff saved to https://phabricator.wikimedia.org/P89724 and previous config saved to /var/cache/conftool/dbconfig/20260303-161323-fceratto.json
* 16:12 fceratto@cumin1003: dbctl commit (dc=all): 'Setting db1188 weight to 300 [[phab:T416705|T416705]]', diff saved to https://phabricator.wikimedia.org/P89723 and previous config saved to /var/cache/conftool/dbconfig/20260303-161230-fceratto.json
* 16:07 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 16:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89722 and previous config saved to /var/cache/conftool/dbconfig/20260303-160720-marostegui.json
* 16:07 brennen@deploy2002: Finished deploy [phabricator/deployment@a883b6d]: deploy phab1004 for [[phab:T418872|T418872]] (duration: 01m 07s)
* 16:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1187.eqiad.wmnet
* 16:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1186.eqiad.wmnet
* 16:05 brennen@deploy2002: Started deploy [phabricator/deployment@a883b6d]: deploy phab1004 for [[phab:T418872|T418872]]
* 16:05 brennen@deploy2002: Finished deploy [phabricator/deployment@a883b6d]: deploy phab2002 for [[phab:T418872|T418872]] (duration: 00m 32s)
* 16:04 brennen@deploy2002: Started deploy [phabricator/deployment@a883b6d]: deploy phab2002 for [[phab:T418872|T418872]]
* 16:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2240 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89721 and previous config saved to /var/cache/conftool/dbconfig/20260303-160207-marostegui.json
* 16:02 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
* 16:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2240.codfw.wmnet with reason: Maintenance
* 16:01 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
* 16:00 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247569{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]] (duration: 09m 28s)
* 15:54 zabe@deploy2002: zabe: Continuing with sync
* 15:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1186.eqiad.wmnet
* 15:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1185.eqiad.wmnet
* 15:54 zabe@deploy2002: zabe: Backport for [[gerrit:1247569{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:53 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade ()
* 15:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P89720 and previous config saved to /var/cache/conftool/dbconfig/20260303-155212-marostegui.json
* 15:50 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1247569{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]]
* 15:49 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade ()
* 15:45 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:42 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1185.eqiad.wmnet
* 15:42 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1184.eqiad.wmnet
* 15:42 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:41 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 15:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 15:41 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 15:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89719 and previous config saved to /var/cache/conftool/dbconfig/20260303-154104-marostegui.json
* 15:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P89718 and previous config saved to /var/cache/conftool/dbconfig/20260303-153704-marostegui.json
* 15:36 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1352-1359].eqiad.wmnet
* 15:36 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1352-1359].eqiad.wmnet
* 15:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1184.eqiad.wmnet
* 15:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1183.eqiad.wmnet
* 15:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P89717 and previous config saved to /var/cache/conftool/dbconfig/20260303-152557-marostegui.json
* 15:23 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1178.eqiad.wmnet
* 15:22 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on P<nowiki>{</nowiki>cp5032.*<nowiki>}</nowiki> and A:cp - 3.0 upgrade ()
* 15:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89716 and previous config saved to /var/cache/conftool/dbconfig/20260303-152157-marostegui.json
* 15:19 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1183.eqiad.wmnet
* 15:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1182.eqiad.wmnet
* 15:16 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P<nowiki>{</nowiki>cp5032.*<nowiki>}</nowiki> and A:cp - 3.0 upgrade ()
* 15:15 fabfur@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=1) rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
* 15:14 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade ()
* 15:14 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
* 15:13 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade ()
* 15:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 15:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P89715 and previous config saved to /var/cache/conftool/dbconfig/20260303-151049-marostegui.json
* 15:08 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:07 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1182.eqiad.wmnet
* 15:07 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1181.eqiad.wmnet
* 14:57 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1244 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89714 and previous config saved to /var/cache/conftool/dbconfig/20260303-145727-marostegui.json
* 14:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1244.eqiad.wmnet with reason: Maintenance
* 14:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89713 and previous config saved to /var/cache/conftool/dbconfig/20260303-145704-marostegui.json
* 14:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89712 and previous config saved to /var/cache/conftool/dbconfig/20260303-145541-marostegui.json
* 14:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1181.eqiad.wmnet
* 14:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1180.eqiad.wmnet
* 14:49 moritzm: installing php7.4 security updates
* 14:46 jayme@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[1352-1359].eqiad.wmnet
* 14:46 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1352-1359].eqiad.wmnet
* 14:43 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1180.eqiad.wmnet
* 14:43 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1179.eqiad.wmnet
* 14:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P89711 and previous config saved to /var/cache/conftool/dbconfig/20260303-144156-marostegui.json
* 14:38 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 14:38 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1240716{{!}}Remove Editing-related config for special wikis (T400063)]] (duration: 06m 34s)
* 14:36 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:34 esanders@deploy2002: esanders: Continuing with sync
* 14:34 esanders@deploy2002: esanders: Backport for [[gerrit:1240716{{!}}Remove Editing-related config for special wikis (T400063)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:34 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:32 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1240716{{!}}Remove Editing-related config for special wikis (T400063)]]
* 14:31 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1179.eqiad.wmnet
* 14:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1178.eqiad.wmnet
* 14:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89710 and previous config saved to /var/cache/conftool/dbconfig/20260303-143141-marostegui.json
* 14:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 14:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89709 and previous config saved to /var/cache/conftool/dbconfig/20260303-143117-marostegui.json
* 14:29 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247578{{!}}PasteCheck: Enable by default (T405127)]] (duration: 08m 01s)
* 14:27 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
* 14:27 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet
* 14:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P89708 and previous config saved to /var/cache/conftool/dbconfig/20260303-142649-marostegui.json
* 14:26 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
* 14:25 esanders@deploy2002: esanders: Continuing with sync
* 14:23 esanders@deploy2002: esanders: Backport for [[gerrit:1247578{{!}}PasteCheck: Enable by default (T405127)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:21 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1247578{{!}}PasteCheck: Enable by default (T405127)]]
* 14:20 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet
* 14:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P89707 and previous config saved to /var/cache/conftool/dbconfig/20260303-141610-marostegui.json
* 14:15 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247576{{!}}Enable Wikibase GraphQL on test.wikidata.org (T417619)]], [[gerrit:1247577{{!}}Enable Wikibase GraphQL on production wikidata.org (T417619)]] (duration: 08m 17s)
* 14:11 esanders@deploy2002: esanders, jakob: Continuing with sync
* 14:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89706 and previous config saved to /var/cache/conftool/dbconfig/20260303-141142-marostegui.json
* 14:09 esanders@deploy2002: esanders, jakob: Backport for [[gerrit:1247576{{!}}Enable Wikibase GraphQL on test.wikidata.org (T417619)]], [[gerrit:1247577{{!}}Enable Wikibase GraphQL on production wikidata.org (T417619)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1247576{{!}}Enable Wikibase GraphQL on test.wikidata.org (T417619)]], [[gerrit:1247577{{!}}Enable Wikibase GraphQL on production wikidata.org (T417619)]]
* 14:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P89704 and previous config saved to /var/cache/conftool/dbconfig/20260303-140102-marostegui.json
* 13:47 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1243 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89703 and previous config saved to /var/cache/conftool/dbconfig/20260303-134702-marostegui.json
* 13:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1243.eqiad.wmnet with reason: Maintenance
* 13:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89702 and previous config saved to /var/cache/conftool/dbconfig/20260303-134639-marostegui.json
* 13:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89701 and previous config saved to /var/cache/conftool/dbconfig/20260303-134554-marostegui.json
* 13:31 moritzm: installing NSS security updates
* 13:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P89700 and previous config saved to /var/cache/conftool/dbconfig/20260303-133131-marostegui.json
* 13:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89699 and previous config saved to /var/cache/conftool/dbconfig/20260303-132414-marostegui.json
* 13:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 13:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89698 and previous config saved to /var/cache/conftool/dbconfig/20260303-132350-marostegui.json
* 13:20 tappof: Thanos: re-enable querier<->ruler cross-site traffic [[phab:T412924|T412924]]
* 13:17 dpogorzelski@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=recommendation-api,name=eqiad
* 13:17 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/ml-serve-eqiad: maintenance
* 13:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P89697 and previous config saved to /var/cache/conftool/dbconfig/20260303-131624-marostegui.json
* 13:16 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/ml-serve-eqiad: maintenance
* 13:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 13:11 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1177.eqiad.wmnet
* 13:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1359.eqiad.wmnet with OS trixie
* 13:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P89696 and previous config saved to /var/cache/conftool/dbconfig/20260303-130842-marostegui.json
* 13:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89695 and previous config saved to /var/cache/conftool/dbconfig/20260303-130117-marostegui.json
* 13:01 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:00 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1177.eqiad.wmnet
* 13:00 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 13:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1176.eqiad.wmnet
* 12:59 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1358.eqiad.wmnet with OS trixie
* 12:56 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:55 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:53 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1359.eqiad.wmnet with reason: host reimage
* 12:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P89694 and previous config saved to /var/cache/conftool/dbconfig/20260303-125335-marostegui.json
* 12:52 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1357.eqiad.wmnet with OS trixie
* 12:51 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:50 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:48 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1359.eqiad.wmnet with reason: host reimage
* 12:48 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1356.eqiad.wmnet with OS trixie
* 12:47 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:47 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 12:47 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:47 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 12:46 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 12:46 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 12:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1176.eqiad.wmnet
* 12:46 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1175.eqiad.wmnet
* 12:46 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 12:45 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:45 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 12:45 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 12:45 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:45 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'readability' for release 'main' .
* 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'ores-legacy' for release 'main' .
* 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 12:43 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1358.eqiad.wmnet with reason: host reimage
* 12:42 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 12:42 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 12:41 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:40 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247559{{!}}Enable thumb steps on private wikis too (T414805)]] (duration: 13m 01s)
* 12:39 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89693 and previous config saved to /var/cache/conftool/dbconfig/20260303-123827-marostegui.json
* 12:36 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1357.eqiad.wmnet with reason: host reimage
* 12:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1242 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89692 and previous config saved to /var/cache/conftool/dbconfig/20260303-123642-marostegui.json
* 12:36 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1359.eqiad.wmnet with OS trixie
* 12:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1242.eqiad.wmnet with reason: Maintenance
* 12:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89691 and previous config saved to /var/cache/conftool/dbconfig/20260303-123619-marostegui.json
* 12:34 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1175.eqiad.wmnet
* 12:34 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1174.eqiad.wmnet
* 12:34 dpogorzelski@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=recommendation-api,name=eqiad
* 12:33 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 12:33 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1356.eqiad.wmnet with reason: host reimage
* 12:31 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:31 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in eqiad/ml-serve-eqiad: maintenance
* 12:31 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:31 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1247559{{!}}Enable thumb steps on private wikis too (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:30 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in eqiad/ml-serve-eqiad: maintenance
* 12:28 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1358.eqiad.wmnet with reason: host reimage
* 12:28 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1357.eqiad.wmnet with reason: host reimage
* 12:28 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1356.eqiad.wmnet with reason: host reimage
* 12:27 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:27 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1247559{{!}}Enable thumb steps on private wikis too (T414805)]]
* 12:26 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1174.eqiad.wmnet
* 12:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1173.eqiad.wmnet
* 12:21 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P89690 and previous config saved to /var/cache/conftool/dbconfig/20260303-122112-marostegui.json
* 12:20 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:20 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:19 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1353.eqiad.wmnet with OS trixie
* 12:16 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1358.eqiad.wmnet with OS trixie
* 12:16 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1357.eqiad.wmnet with OS trixie
* 12:15 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1356.eqiad.wmnet with OS trixie
* 12:14 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1355.eqiad.wmnet with OS trixie
* 12:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89689 and previous config saved to /var/cache/conftool/dbconfig/20260303-121420-marostegui.json
* 12:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2219.codfw.wmnet with reason: Maintenance
* 12:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89688 and previous config saved to /var/cache/conftool/dbconfig/20260303-121355-marostegui.json
* 12:09 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1354.eqiad.wmnet with OS trixie
* 12:08 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1173.eqiad.wmnet
* 12:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1172.eqiad.wmnet
* 12:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P89687 and previous config saved to /var/cache/conftool/dbconfig/20260303-120604-marostegui.json
* 12:04 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1352.eqiad.wmnet with OS trixie
* 12:02 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1353.eqiad.wmnet with reason: host reimage
* 11:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P89686 and previous config saved to /var/cache/conftool/dbconfig/20260303-115847-marostegui.json
* 11:58 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1355.eqiad.wmnet with reason: host reimage
* 11:52 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1354.eqiad.wmnet with reason: host reimage
* 11:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89685 and previous config saved to /var/cache/conftool/dbconfig/20260303-115057-marostegui.json
* 11:48 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1352.eqiad.wmnet with reason: host reimage
* 11:44 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1355.eqiad.wmnet with reason: host reimage
* 11:43 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1354.eqiad.wmnet with reason: host reimage
* 11:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P89684 and previous config saved to /var/cache/conftool/dbconfig/20260303-114341-marostegui.json
* 11:43 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1353.eqiad.wmnet with reason: host reimage
* 11:42 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1352.eqiad.wmnet with reason: host reimage
* 11:40 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
* 11:36 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
* 11:31 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1355.eqiad.wmnet with OS trixie
* 11:31 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1354.eqiad.wmnet with OS trixie
* 11:30 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1353.eqiad.wmnet with OS trixie
* 11:30 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1352.eqiad.wmnet with OS trixie
* 11:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T418465|T418465]])', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260303-112828-marostegui.json
* 11:25 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1241 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89683 and previous config saved to /var/cache/conftool/dbconfig/20260303-112535-marostegui.json
* 11:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1241.eqiad.wmnet with reason: Maintenance
* 11:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89682 and previous config saved to /var/cache/conftool/dbconfig/20260303-112511-marostegui.json
* 11:21 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 11:18 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 11:18 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:17 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:17 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 11:16 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1350-1351].eqiad.wmnet
* 11:16 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1350-1351].eqiad.wmnet
* 11:15 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 11:15 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:15 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:15 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 11:14 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 11:14 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 11:13 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 11:13 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 11:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1172.eqiad.wmnet
* 11:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1171.eqiad.wmnet
* 11:13 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 11:13 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 11:12 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 11:11 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 11:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P89681 and previous config saved to /var/cache/conftool/dbconfig/20260303-111003-marostegui.json
* 11:09 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 11:08 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 11:08 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 11:07 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 11:07 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 11:06 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 11:05 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89680 and previous config saved to /var/cache/conftool/dbconfig/20260303-110551-marostegui.json
* 11:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2210.codfw.wmnet with reason: Maintenance
* 11:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89679 and previous config saved to /var/cache/conftool/dbconfig/20260303-110527-marostegui.json
* 10:59 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1171.eqiad.wmnet
* 10:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1170.eqiad.wmnet
* 10:57 slyngshede@dns1004: END - running authdns-update
* 10:55 slyngshede@dns1004: START - running authdns-update
* 10:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P89678 and previous config saved to /var/cache/conftool/dbconfig/20260303-105455-marostegui.json
* 10:54 hashar@deploy2002: Finished deploy [gerrit/gerrit@12177b1]: wm-checks-api: add tag for Selenium jobs (duration: 00m 13s)
* 10:54 hashar@deploy2002: Started deploy [gerrit/gerrit@12177b1]: wm-checks-api: add tag for Selenium jobs
* 10:51 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
* 10:51 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
* 10:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P89677 and previous config saved to /var/cache/conftool/dbconfig/20260303-105020-marostegui.json
* 10:47 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1170.eqiad.wmnet
* 10:45 fabfur: start upgrading haproxy to 3.0 on A:cp-eqsin ([[phab:T417253|T417253]])
* 10:43 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 10:41 moritzm: installing Django security updates
* 10:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89676 and previous config saved to /var/cache/conftool/dbconfig/20260303-103947-marostegui.json
* 10:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P89675 and previous config saved to /var/cache/conftool/dbconfig/20260303-103512-marostegui.json
* 10:34 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:33 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:33 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:31 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 10:25 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 10:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89674 and previous config saved to /var/cache/conftool/dbconfig/20260303-102004-marostegui.json
* 10:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1238 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89673 and previous config saved to /var/cache/conftool/dbconfig/20260303-101800-marostegui.json
* 10:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1238.eqiad.wmnet with reason: Maintenance
* 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89672 and previous config saved to /var/cache/conftool/dbconfig/20260303-101747-marostegui.json
* 09:57 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 09:56 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89670 and previous config saved to /var/cache/conftool/dbconfig/20260303-095655-marostegui.json
* 09:56 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2206.codfw.wmnet with reason: Maintenance
* 09:53 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:51 moritzm: installing qemu security updates
* 09:48 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 09:48 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 09:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P89669 and previous config saved to /var/cache/conftool/dbconfig/20260303-094732-marostegui.json
* 09:47 elukey@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 09:47 elukey@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 09:45 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 09:45 elukey@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 09:44 elukey@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 09:44 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'ores-legacy' for release 'main' .
* 09:44 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 09:44 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1176.eqiad.wmnet with reason: host reimage
* 09:43 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 09:40 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 09:38 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1176.eqiad.wmnet with reason: host reimage
* 09:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2199.codfw.wmnet with reason: Maintenance
* 09:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89668 and previous config saved to /var/cache/conftool/dbconfig/20260303-093542-marostegui.json
* 09:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89667 and previous config saved to /var/cache/conftool/dbconfig/20260303-093224-marostegui.json
* 09:32 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 09:23 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade ()
* 09:23 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db1176.eqiad.wmnet with OS trixie
* 09:21 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 09:20 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 09:20 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade ()
* 09:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P89666 and previous config saved to /var/cache/conftool/dbconfig/20260303-092034-marostegui.json
* 09:19 arnaudb@dns1004: END - running authdns-update
* 09:18 arnaudb@dns1004: START - running authdns-update
* 09:17 moritzm: installing libbpf updates from Bookworm point release
* 09:08 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89665 and previous config saved to /var/cache/conftool/dbconfig/20260303-090818-marostegui.json
* 09:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on 6 hosts with reason: Maintenance
* 09:07 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1221.eqiad.wmnet with reason: Maintenance
* 09:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89664 and previous config saved to /var/cache/conftool/dbconfig/20260303-090731-marostegui.json
* 09:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P89663 and previous config saved to /var/cache/conftool/dbconfig/20260303-090526-marostegui.json
* 08:54 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-serve-codfw: maintenance
* 08:53 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-serve-codfw: maintenance
* 08:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P89662 and previous config saved to /var/cache/conftool/dbconfig/20260303-085224-marostegui.json
* 08:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89661 and previous config saved to /var/cache/conftool/dbconfig/20260303-085019-marostegui.json
* 08:47 moritzm: powercycling lvs1013
* 08:41 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade ()
* 08:41 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade ()
* 08:37 fabfur: start upgrading haproxy to 3.0 on A:cp-ulsfo ([[phab:T417253|T417253]])
* 08:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P89660 and previous config saved to /var/cache/conftool/dbconfig/20260303-083716-marostegui.json
* 08:32 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 08:32 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:31 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 08:30 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 08:28 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-serve-codfw: maintenance
* 08:27 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-serve-codfw: maintenance
* 08:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89659 and previous config saved to /var/cache/conftool/dbconfig/20260303-082424-marostegui.json
* 08:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 08:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89658 and previous config saved to /var/cache/conftool/dbconfig/20260303-082400-marostegui.json
* 08:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89657 and previous config saved to /var/cache/conftool/dbconfig/20260303-082209-marostegui.json
* 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P89656 and previous config saved to /var/cache/conftool/dbconfig/20260303-080853-marostegui.json
* 08:07 moritzm: installing PAM security updates on Bookworm
* 07:55 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1199 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89655 and previous config saved to /var/cache/conftool/dbconfig/20260303-075526-marostegui.json
* 07:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1199.eqiad.wmnet with reason: Maintenance
* 07:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89654 and previous config saved to /var/cache/conftool/dbconfig/20260303-075502-marostegui.json
* 07:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P89653 and previous config saved to /var/cache/conftool/dbconfig/20260303-075345-marostegui.json
* 07:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P89652 and previous config saved to /var/cache/conftool/dbconfig/20260303-073955-marostegui.json
* 07:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89651 and previous config saved to /var/cache/conftool/dbconfig/20260303-073838-marostegui.json
* 07:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P89650 and previous config saved to /var/cache/conftool/dbconfig/20260303-072447-marostegui.json
* 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
* 07:18 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
* 07:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89649 and previous config saved to /var/cache/conftool/dbconfig/20260303-071054-marostegui.json
* 07:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 07:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89648 and previous config saved to /var/cache/conftool/dbconfig/20260303-071029-marostegui.json
* 07:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89647 and previous config saved to /var/cache/conftool/dbconfig/20260303-070940-marostegui.json
* 06:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P89646 and previous config saved to /var/cache/conftool/dbconfig/20260303-065523-marostegui.json
* 06:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1190 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89645 and previous config saved to /var/cache/conftool/dbconfig/20260303-064405-marostegui.json
* 06:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1190.eqiad.wmnet with reason: Maintenance
* 06:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P89644 and previous config saved to /var/cache/conftool/dbconfig/20260303-064015-marostegui.json
* 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2240 gradually with 4 steps - repool after schema change
* 06:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89642 and previous config saved to /var/cache/conftool/dbconfig/20260303-062507-marostegui.json
* 05:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2147 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89639 and previous config saved to /var/cache/conftool/dbconfig/20260303-055834-marostegui.json
* 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2147.codfw.wmnet with reason: Maintenance
* 05:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 05:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2212.codfw.wmnet with reason: Maintenance
* 05:48 marostegui@cumin1003: START - Cookbook sre.mysql.pool db2240 gradually with 4 steps - repool after schema change
* 05:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 05:01 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.15 (duration: 01m 10s)
* 04:43 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.18 refs [[phab:T413809|T413809]] (duration: 39m 43s)
* 04:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.18 refs [[phab:T413809|T413809]]
* 03:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 03:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89637 and previous config saved to /var/cache/conftool/dbconfig/20260303-035746-marostegui.json
* 03:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P89636 and previous config saved to /var/cache/conftool/dbconfig/20260303-034239-marostegui.json
* 03:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P89635 and previous config saved to /var/cache/conftool/dbconfig/20260303-032731-marostegui.json
* 03:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89634 and previous config saved to /var/cache/conftool/dbconfig/20260303-031224-marostegui.json
* 03:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1251 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89633 and previous config saved to /var/cache/conftool/dbconfig/20260303-030217-marostegui.json
* 03:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1251.eqiad.wmnet with reason: Maintenance
* 02:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1240.eqiad.wmnet with reason: Maintenance
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 00s)
* 02:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 02:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89632 and previous config saved to /var/cache/conftool/dbconfig/20260303-020817-marostegui.json
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P89631 and previous config saved to /var/cache/conftool/dbconfig/20260303-015309-marostegui.json
* 01:42 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mwlog2003.codfw.wmnet with OS trixie
* 01:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P89630 and previous config saved to /var/cache/conftool/dbconfig/20260303-013802-marostegui.json
* 01:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89629 and previous config saved to /var/cache/conftool/dbconfig/20260303-013719-marostegui.json
* 01:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89628 and previous config saved to /var/cache/conftool/dbconfig/20260303-012254-marostegui.json
* 01:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P89627 and previous config saved to /var/cache/conftool/dbconfig/20260303-012211-marostegui.json
* 01:19 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mwlog2003.codfw.wmnet with reason: host reimage
* 01:11 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mwlog2003.codfw.wmnet with reason: host reimage
* 01:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1235 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89626 and previous config saved to /var/cache/conftool/dbconfig/20260303-011151-marostegui.json
* 01:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1235.eqiad.wmnet with reason: Maintenance
* 01:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89625 and previous config saved to /var/cache/conftool/dbconfig/20260303-011128-marostegui.json
* 01:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P89624 and previous config saved to /var/cache/conftool/dbconfig/20260303-010703-marostegui.json
* 00:59 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247189{{!}}Revert "ImageListPager: Properly support file schema migration read new"]] (duration: 08m 12s)
* 00:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P89623 and previous config saved to /var/cache/conftool/dbconfig/20260303-005620-marostegui.json
* 00:56 zabe@deploy2002: zabe: Continuing with sync
* 00:54 herron@cumin1003: START - Cookbook sre.hosts.reimage for host mwlog2003.codfw.wmnet with OS trixie
* 00:53 zabe@deploy2002: zabe: Backport for [[gerrit:1247189{{!}}Revert "ImageListPager: Properly support file schema migration read new"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mwlog2003.codfw.wmnet with OS trixie
* 00:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89622 and previous config saved to /var/cache/conftool/dbconfig/20260303-005156-marostegui.json
* 00:51 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1247189{{!}}Revert "ImageListPager: Properly support file schema migration read new"]]
* 00:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P89621 and previous config saved to /var/cache/conftool/dbconfig/20260303-004112-marostegui.json
* 00:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2216 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89620 and previous config saved to /var/cache/conftool/dbconfig/20260303-004056-marostegui.json
* 00:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2216.codfw.wmnet with reason: Maintenance
* 00:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89619 and previous config saved to /var/cache/conftool/dbconfig/20260303-004033-marostegui.json
* 00:31 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mwlog1003.eqiad.wmnet with OS trixie
* 00:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89618 and previous config saved to /var/cache/conftool/dbconfig/20260303-002604-marostegui.json
* 00:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to https://phabricator.wikimedia.org/P89617 and previous config saved to /var/cache/conftool/dbconfig/20260303-002525-marostegui.json
* 00:20 zabe@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 00:18 zabe@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 00:18 zabe@deploy2002: Finished scap sync-world: [[phab:T418327|T418327]] (duration: 05m 01s)
* 00:15 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1234 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89616 and previous config saved to /var/cache/conftool/dbconfig/20260303-001504-marostegui.json
* 00:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1234.eqiad.wmnet with reason: Maintenance
* 00:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89615 and previous config saved to /var/cache/conftool/dbconfig/20260303-001440-marostegui.json
* 00:13 zabe@deploy2002: Started scap sync-world: [[phab:T418327|T418327]]
* 00:11 zabe@deploy2002: zabe: Continuing with sync
* 00:10 zabe@deploy2002: zabe: Backport for [[gerrit:1247068{{!}}ImageListPager: Properly support file schema migration read new (T418327)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to https://phabricator.wikimedia.org/P89614 and previous config saved to /var/cache/conftool/dbconfig/20260303-001018-marostegui.json
* 00:08 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1247068{{!}}ImageListPager: Properly support file schema migration read new (T418327)]]
== 2026-03-02 ==
* 23:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P89613 and previous config saved to /var/cache/conftool/dbconfig/20260302-235933-marostegui.json
* 23:58 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1240320{{!}}Stop writing to il_to on testwiki (T415787)]] (duration: 06m 02s)
* 23:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89612 and previous config saved to /var/cache/conftool/dbconfig/20260302-235511-marostegui.json
* 23:54 zabe@deploy2002: zabe: Continuing with sync
* 23:53 zabe@deploy2002: zabe: Backport for [[gerrit:1240320{{!}}Stop writing to il_to on testwiki (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:52 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1240320{{!}}Stop writing to il_to on testwiki (T415787)]]
* 23:51 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp2058.codfw.wmnet with reason: dcops troubleshooting for [[phab:T418527|T418527]]
* 23:50 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1246880{{!}}multiversion: Stop setting MW_USE_CONFIG_SCHEMA (T304460)]] (duration: 07m 10s)
* 23:47 zabe@deploy2002: zabe: Continuing with sync
* 23:45 zabe@deploy2002: zabe: Backport for [[gerrit:1246880{{!}}multiversion: Stop setting MW_USE_CONFIG_SCHEMA (T304460)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P89611 and previous config saved to /var/cache/conftool/dbconfig/20260302-234425-marostegui.json
* 23:44 herron@cumin1003: START - Cookbook sre.hosts.reimage for host mwlog2003.codfw.wmnet with OS trixie
* 23:43 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89610 and previous config saved to /var/cache/conftool/dbconfig/20260302-234350-marostegui.json
* 23:43 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1246880{{!}}multiversion: Stop setting MW_USE_CONFIG_SCHEMA (T304460)]]
* 23:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2203.codfw.wmnet with reason: Maintenance
* 23:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2202.codfw.wmnet with reason: Maintenance
* 23:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89609 and previous config saved to /var/cache/conftool/dbconfig/20260302-233517-marostegui.json
* 23:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89608 and previous config saved to /var/cache/conftool/dbconfig/20260302-232918-marostegui.json
* 23:25 dwisehaupt@dns1006: END - running authdns-update
* 23:24 dwisehaupt@dns1006: START - running authdns-update
* 23:23 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mwlog1003.eqiad.wmnet with reason: host reimage
* 23:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P89607 and previous config saved to /var/cache/conftool/dbconfig/20260302-232009-marostegui.json
* 23:18 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mwlog1003.eqiad.wmnet with reason: host reimage
* 23:17 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1232 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89606 and previous config saved to /var/cache/conftool/dbconfig/20260302-231723-marostegui.json
* 23:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1232.eqiad.wmnet with reason: Maintenance
* 23:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89605 and previous config saved to /var/cache/conftool/dbconfig/20260302-231658-marostegui.json
* 23:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P89604 and previous config saved to /var/cache/conftool/dbconfig/20260302-230502-marostegui.json
* 23:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P89603 and previous config saved to /var/cache/conftool/dbconfig/20260302-230151-marostegui.json
* 22:57 herron@cumin1003: START - Cookbook sre.hosts.reimage for host mwlog1003.eqiad.wmnet with OS trixie
* 22:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89602 and previous config saved to /var/cache/conftool/dbconfig/20260302-224954-marostegui.json
* 22:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P89601 and previous config saved to /var/cache/conftool/dbconfig/20260302-224643-marostegui.json
* 22:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2188 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89600 and previous config saved to /var/cache/conftool/dbconfig/20260302-223612-marostegui.json
* 22:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2188.codfw.wmnet with reason: Maintenance
* 22:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89599 and previous config saved to /var/cache/conftool/dbconfig/20260302-223548-marostegui.json
* 22:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89598 and previous config saved to /var/cache/conftool/dbconfig/20260302-223135-marostegui.json
* 22:21 maryum: Deployed security fix for [[phab:T418179|T418179]]
* 22:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P89597 and previous config saved to /var/cache/conftool/dbconfig/20260302-222041-marostegui.json
* 22:19 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89596 and previous config saved to /var/cache/conftool/dbconfig/20260302-221938-marostegui.json
* 22:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1219.eqiad.wmnet with reason: Maintenance
* 22:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89595 and previous config saved to /var/cache/conftool/dbconfig/20260302-221925-marostegui.json
* 22:10 aaron@deploy2002: Finished scap sync-world: Backport for [[gerrit:1242613{{!}}Add growthexperiments.v0 to $wgRestSandboxSpecs (T414470)]] (duration: 06m 39s)
* 22:06 aaron@deploy2002: aaron: Continuing with sync
* 22:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P89594 and previous config saved to /var/cache/conftool/dbconfig/20260302-220533-marostegui.json
* 22:05 aaron@deploy2002: aaron: Backport for [[gerrit:1242613{{!}}Add growthexperiments.v0 to $wgRestSandboxSpecs (T414470)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P89593 and previous config saved to /var/cache/conftool/dbconfig/20260302-220418-marostegui.json
* 22:03 aaron@deploy2002: Started scap sync-world: Backport for [[gerrit:1242613{{!}}Add growthexperiments.v0 to $wgRestSandboxSpecs (T414470)]]
* 22:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-backup2003.codfw.wmnet with OS trixie
* 22:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-backup2004.codfw.wmnet with OS trixie
* 22:03 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 22:03 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 22:02 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 22:01 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247149{{!}}ApiCSPReport: Use structured logging for CSP reports]] (duration: 08m 19s)
* 21:57 catrope@deploy2002: catrope: Continuing with sync
* 21:56 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 21:55 catrope@deploy2002: catrope: Backport for [[gerrit:1247149{{!}}ApiCSPReport: Use structured logging for CSP reports]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:53 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1247149{{!}}ApiCSPReport: Use structured logging for CSP reports]]
* 21:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89592 and previous config saved to /var/cache/conftool/dbconfig/20260302-215025-marostegui.json
* 21:50 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp2043.codfw.wmnet with reason: These are test instances, failing should not notif
* 21:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P89591 and previous config saved to /var/cache/conftool/dbconfig/20260302-214910-marostegui.json
* 21:48 inflatador: bking@desktop restarting wdqs codfw to clear ProbeDown alerts
* 21:43 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cp2043.codfw.wmnet
* 21:42 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-backup2004.codfw.wmnet with reason: host reimage
* 21:39 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2176 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89590 and previous config saved to /var/cache/conftool/dbconfig/20260302-213957-marostegui.json
* 21:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 21:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89589 and previous config saved to /var/cache/conftool/dbconfig/20260302-213934-marostegui.json
* 21:38 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-backup2003.codfw.wmnet with reason: host reimage
* 21:36 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:cassandra-dev: Testing removal of OpenJDK 8 support - eevans@cumin1003
* 21:34 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1226024{{!}}Add Comments namespace for shnwikinews (T414403)]] (duration: 07m 07s)
* 21:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89588 and previous config saved to /var/cache/conftool/dbconfig/20260302-213402-marostegui.json
* 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-backup2004.codfw.wmnet with reason: host reimage
* 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-backup2003.codfw.wmnet with reason: host reimage
* 21:30 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp2043.codfw.wmnet
* 21:30 catrope@deploy2002: shivaanshsingh, catrope: Continuing with sync
* 21:29 catrope@deploy2002: shivaanshsingh, catrope: Backport for [[gerrit:1226024{{!}}Add Comments namespace for shnwikinews (T414403)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:27 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1226024{{!}}Add Comments namespace for shnwikinews (T414403)]]
* 21:24 kemayo@deploy2002: Finished scap sync-world: Backport for [[gerrit:1243990{{!}}Suggestion Mode: add values for suggestion feedback properties (T401739)]], [[gerrit:1240721{{!}}Stop PasteCheck A/B test (T417429)]] (duration: 10m 55s)
* 21:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P89587 and previous config saved to /var/cache/conftool/dbconfig/20260302-212426-marostegui.json
* 21:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89586 and previous config saved to /var/cache/conftool/dbconfig/20260302-212345-marostegui.json
* 21:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1218.eqiad.wmnet with reason: Maintenance
* 21:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89585 and previous config saved to /var/cache/conftool/dbconfig/20260302-212321-marostegui.json
* 21:20 kemayo@deploy2002: esanders, kemayo, caro: Continuing with sync
* 21:17 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host ms-backup2004.codfw.wmnet with OS trixie
* 21:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host ms-backup2003.codfw.wmnet with OS trixie
* 21:16 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:cassandra-dev: Testing removal of OpenJDK 8 support - eevans@cumin1003
* 21:16 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ms-backup2003']
* 21:15 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-backup2003']
* 21:15 kemayo@deploy2002: esanders, kemayo, caro: Backport for [[gerrit:1243990{{!}}Suggestion Mode: add values for suggestion feedback properties (T401739)]], [[gerrit:1240721{{!}}Stop PasteCheck A/B test (T417429)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:14 inflatador: bking@apt1002 reprepro --component thirdparty/opensearch3 update trixie-wikimedia [[phab:T418388|T418388]]
* 21:13 kemayo@deploy2002: Started scap sync-world: Backport for [[gerrit:1243990{{!}}Suggestion Mode: add values for suggestion feedback properties (T401739)]], [[gerrit:1240721{{!}}Stop PasteCheck A/B test (T417429)]]
* 21:12 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-backup2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:11 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-backup2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:10 dani@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247107{{!}}Undeploy Comparative Reader Research survey on eswiki (T417834)]], [[gerrit:1247105{{!}}Undeploy Comparative Reader Research survey on enwiki (T417829)]] (duration: 06m 52s)
* 21:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P89584 and previous config saved to /var/cache/conftool/dbconfig/20260302-210919-marostegui.json
* 21:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P89583 and previous config saved to /var/cache/conftool/dbconfig/20260302-210813-marostegui.json
* 21:06 dani@deploy2002: dani: Continuing with sync
* 21:05 dani@deploy2002: dani: Backport for [[gerrit:1247107{{!}}Undeploy Comparative Reader Research survey on eswiki (T417834)]], [[gerrit:1247105{{!}}Undeploy Comparative Reader Research survey on enwiki (T417829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:03 dani@deploy2002: Started scap sync-world: Backport for [[gerrit:1247107{{!}}Undeploy Comparative Reader Research survey on eswiki (T417834)]], [[gerrit:1247105{{!}}Undeploy Comparative Reader Research survey on enwiki (T417829)]]
* 20:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-backup2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-backup2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-backup2004
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-backup2004
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-backup2003
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-backup2003
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-backup2003 to codfw - jhancock@cumin2002"
* 20:54 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-backup2003 to codfw - jhancock@cumin2002"
* 20:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89582 and previous config saved to /var/cache/conftool/dbconfig/20260302-205411-marostegui.json
* 20:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P89581 and previous config saved to /var/cache/conftool/dbconfig/20260302-205307-marostegui.json
* 20:50 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 20:42 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:41 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89580 and previous config saved to /var/cache/conftool/dbconfig/20260302-204136-marostegui.json
* 20:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 20:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89579 and previous config saved to /var/cache/conftool/dbconfig/20260302-204112-marostegui.json
* 20:41 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:39 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89578 and previous config saved to /var/cache/conftool/dbconfig/20260302-203759-marostegui.json
* 20:37 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:31 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89577 and previous config saved to /var/cache/conftool/dbconfig/20260302-202740-marostegui.json
* 20:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1206.eqiad.wmnet with reason: Maintenance
* 20:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89576 and previous config saved to /var/cache/conftool/dbconfig/20260302-202716-marostegui.json
* 20:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P89575 and previous config saved to /var/cache/conftool/dbconfig/20260302-202604-marostegui.json
* 20:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P89574 and previous config saved to /var/cache/conftool/dbconfig/20260302-201209-marostegui.json
* 20:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P89573 and previous config saved to /var/cache/conftool/dbconfig/20260302-201057-marostegui.json
* 20:01 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 20:00 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 19:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P89572 and previous config saved to /var/cache/conftool/dbconfig/20260302-195702-marostegui.json
* 19:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89571 and previous config saved to /var/cache/conftool/dbconfig/20260302-195549-marostegui.json
* 19:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2173 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89570 and previous config saved to /var/cache/conftool/dbconfig/20260302-194435-marostegui.json
* 19:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 19:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89569 and previous config saved to /var/cache/conftool/dbconfig/20260302-194411-marostegui.json
* 19:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89568 and previous config saved to /var/cache/conftool/dbconfig/20260302-194155-marostegui.json
* 19:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1196 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89566 and previous config saved to /var/cache/conftool/dbconfig/20260302-193119-marostegui.json
* 19:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 19:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 19:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89565 and previous config saved to /var/cache/conftool/dbconfig/20260302-193046-marostegui.json
* 19:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P89564 and previous config saved to /var/cache/conftool/dbconfig/20260302-192903-marostegui.json
* 19:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P89563 and previous config saved to /var/cache/conftool/dbconfig/20260302-191539-marostegui.json
* 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P89562 and previous config saved to /var/cache/conftool/dbconfig/20260302-191355-marostegui.json
* 19:12 dzahn@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:12 dzahn@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:04 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2095.codfw.wmnet with OS bullseye
* 19:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P89561 and previous config saved to /var/cache/conftool/dbconfig/20260302-190032-marostegui.json
* 18:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89560 and previous config saved to /var/cache/conftool/dbconfig/20260302-185848-marostegui.json
* 18:54 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 18:53 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 18:48 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89559 and previous config saved to /var/cache/conftool/dbconfig/20260302-184832-marostegui.json
* 18:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 18:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89558 and previous config saved to /var/cache/conftool/dbconfig/20260302-184808-marostegui.json
* 18:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89557 and previous config saved to /var/cache/conftool/dbconfig/20260302-184524-marostegui.json
* 18:34 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89556 and previous config saved to /var/cache/conftool/dbconfig/20260302-183449-marostegui.json
* 18:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1195.eqiad.wmnet with reason: Maintenance
* 18:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89555 and previous config saved to /var/cache/conftool/dbconfig/20260302-183425-marostegui.json
* 18:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P89554 and previous config saved to /var/cache/conftool/dbconfig/20260302-183300-marostegui.json
* 18:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P89553 and previous config saved to /var/cache/conftool/dbconfig/20260302-181918-marostegui.json
* 18:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P89552 and previous config saved to /var/cache/conftool/dbconfig/20260302-181753-marostegui.json
* 18:16 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 18:04 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P89551 and previous config saved to /var/cache/conftool/dbconfig/20260302-180411-marostegui.json
* 18:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89550 and previous config saved to /var/cache/conftool/dbconfig/20260302-180245-marostegui.json
* 18:01 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:54 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:53 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host contint2003
* 17:53 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host contint2003
* 17:52 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:52 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
* 17:52 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
* 17:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89549 and previous config saved to /var/cache/conftool/dbconfig/20260302-174917-marostegui.json
* 17:49 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 17:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 17:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89548 and previous config saved to /var/cache/conftool/dbconfig/20260302-174903-marostegui.json
* 17:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89547 and previous config saved to /var/cache/conftool/dbconfig/20260302-174854-marostegui.json
* 17:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:44 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2095.codfw.wmnet with OS bullseye
* 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host contint2003
* 17:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host contint2003
* 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
* 17:42 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
* 17:39 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 17:38 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1186 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89546 and previous config saved to /var/cache/conftool/dbconfig/20260302-173827-marostegui.json
* 17:38 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 17:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89545 and previous config saved to /var/cache/conftool/dbconfig/20260302-173803-marostegui.json
* 17:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 17:36 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 17:34 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 17:33 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 17:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P89544 and previous config saved to /var/cache/conftool/dbconfig/20260302-173347-marostegui.json
* 17:32 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.update-replication (exit_code=99)
* 17:32 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 17:24 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 17:23 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 17:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P89543 and previous config saved to /var/cache/conftool/dbconfig/20260302-172256-marostegui.json
* 17:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P89542 and previous config saved to /var/cache/conftool/dbconfig/20260302-171839-marostegui.json
* 17:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P89541 and previous config saved to /var/cache/conftool/dbconfig/20260302-170748-marostegui.json
* 17:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89540 and previous config saved to /var/cache/conftool/dbconfig/20260302-170331-marostegui.json
* 16:52 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2230.codfw.wmnet with OS trixie
* 16:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89539 and previous config saved to /var/cache/conftool/dbconfig/20260302-165240-marostegui.json
* 16:51 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2146 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89538 and previous config saved to /var/cache/conftool/dbconfig/20260302-165153-marostegui.json
* 16:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2146.codfw.wmnet with reason: Maintenance
* 16:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89537 and previous config saved to /var/cache/conftool/dbconfig/20260302-165129-marostegui.json
* 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1184 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89536 and previous config saved to /var/cache/conftool/dbconfig/20260302-164141-marostegui.json
* 16:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89535 and previous config saved to /var/cache/conftool/dbconfig/20260302-164118-marostegui.json
* 16:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P89534 and previous config saved to /var/cache/conftool/dbconfig/20260302-163622-marostegui.json
* 16:29 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2230.codfw.wmnet with reason: host reimage
* 16:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P89533 and previous config saved to /var/cache/conftool/dbconfig/20260302-162610-marostegui.json
* 16:21 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2230.codfw.wmnet with reason: host reimage
* 16:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P89532 and previous config saved to /var/cache/conftool/dbconfig/20260302-162115-marostegui.json
* 16:19 moritzm: installing PAM security updates on Bookworm
* 16:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P89531 and previous config saved to /var/cache/conftool/dbconfig/20260302-161102-marostegui.json
* 16:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89530 and previous config saved to /var/cache/conftool/dbconfig/20260302-160607-marostegui.json
* 16:05 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db2230.codfw.wmnet with OS trixie
* 15:56 moritzm: installing glibc bugfix updates from trixie point release
* 15:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89529 and previous config saved to /var/cache/conftool/dbconfig/20260302-155555-marostegui.json
* 15:55 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2145 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89528 and previous config saved to /var/cache/conftool/dbconfig/20260302-155527-marostegui.json
* 15:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2145.codfw.wmnet with reason: Maintenance
* 15:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 15:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1169.eqiad.wmnet
* 15:45 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89527 and previous config saved to /var/cache/conftool/dbconfig/20260302-154520-marostegui.json
* 15:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 15:38 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 15:34 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1169.eqiad.wmnet
* 15:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1167.eqiad.wmnet
* 15:32 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 15:32 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 15:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1209.eqiad.wmnet with reason: Maintenance
* 15:31 marostegui@cumin1003: dbctl commit (dc=all): 'Restore db1226 full weight after schema change', diff saved to https://phabricator.wikimedia.org/P89526 and previous config saved to /var/cache/conftool/dbconfig/20260302-153100-marostegui.json
* 15:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P89525 and previous config saved to /var/cache/conftool/dbconfig/20260302-152334-marostegui.json
* 15:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1167.eqiad.wmnet
* 15:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1166.eqiad.wmnet
* 15:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2198.codfw.wmnet with reason: Maintenance
* 15:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89524 and previous config saved to /var/cache/conftool/dbconfig/20260302-151838-marostegui.json
* 15:10 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1166.eqiad.wmnet
* 15:10 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1165.eqiad.wmnet
* 15:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P89523 and previous config saved to /var/cache/conftool/dbconfig/20260302-150826-marostegui.json
* 15:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P89522 and previous config saved to /var/cache/conftool/dbconfig/20260302-150330-marostegui.json
* 15:00 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1097.eqiad.wmnet with OS bullseye
* 15:00 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:58 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1165.eqiad.wmnet
* 14:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1164.eqiad.wmnet
* 14:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89520 and previous config saved to /var/cache/conftool/dbconfig/20260302-145318-marostegui.json
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1164.eqiad.wmnet
* 14:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1163.eqiad.wmnet
* 14:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P89519 and previous config saved to /var/cache/conftool/dbconfig/20260302-144823-marostegui.json
* 14:41 Lucas_WMDE: UTC afternoon backport+config window done
* 14:40 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247057{{!}}IPInfo: Set log level to "info" (T374718)]] (duration: 08m 01s)
* 14:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1163.eqiad.wmnet
* 14:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1162.eqiad.wmnet
* 14:36 lucaswerkmeister-wmde@deploy2002: kharlan, lucaswerkmeister-wmde: Continuing with sync
* 14:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1226 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89517 and previous config saved to /var/cache/conftool/dbconfig/20260302-143608-marostegui.json
* 14:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1226.eqiad.wmnet with reason: Maintenance
* 14:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89516 and previous config saved to /var/cache/conftool/dbconfig/20260302-143544-marostegui.json
* 14:34 lucaswerkmeister-wmde@deploy2002: kharlan, lucaswerkmeister-wmde: Backport for [[gerrit:1247057{{!}}IPInfo: Set log level to "info" (T374718)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89515 and previous config saved to /var/cache/conftool/dbconfig/20260302-143315-marostegui.json
* 14:32 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for [[gerrit:1247057{{!}}IPInfo: Set log level to "info" (T374718)]]
* 14:31 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
* 14:30 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for [[gerrit:1245364{{!}}Add configurations for graphql usage survey and its pipeline tests (T414476)]] (duration: 09m 44s)
* 14:27 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:26 lucaswerkmeister-wmde@deploy2002: itamar, lucaswerkmeister-wmde: Continuing with sync
* 14:26 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 14:25 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1162.eqiad.wmnet
* 14:25 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1161.eqiad.wmnet
* 14:23 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
* 14:22 lucaswerkmeister-wmde@deploy2002: itamar, lucaswerkmeister-wmde: Backport for [[gerrit:1245364{{!}}Add configurations for graphql usage survey and its pipeline tests (T414476)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:20 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for [[gerrit:1245364{{!}}Add configurations for graphql usage survey and its pipeline tests (T414476)]]
* 14:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P89514 and previous config saved to /var/cache/conftool/dbconfig/20260302-142037-marostegui.json
* 14:19 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:18 lucaswerkmeister-wmde@deploy2002: mwscript-k8s job started: namespaceDupes lawiki --fix # [[phab:T418706|T418706]]
* 14:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89513 and previous config saved to /var/cache/conftool/dbconfig/20260302-141834-marostegui.json
* 14:18 elukey@puppetserver1001: conftool action : set/pooled=no; selector: name=ms-fe1013.eqiad.wmnet
* 14:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2195.codfw.wmnet with reason: Maintenance
* 14:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2004.codfw.wmnet
* 14:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89512 and previous config saved to /var/cache/conftool/dbconfig/20260302-141810-marostegui.json
* 14:17 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247063{{!}}lawiki: add Adumbratio (draft) namespace (T418706)]] (duration: 07m 27s)
* 14:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people2004.codfw.wmnet
* 14:13 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, anzx: Continuing with sync
* 14:13 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 14:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1161.eqiad.wmnet
* 14:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1160.eqiad.wmnet
* 14:13 moritzm: installing libcap2 updates from Trixie point release
* 14:12 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, anzx: Backport for [[gerrit:1247063{{!}}lawiki: add Adumbratio (draft) namespace (T418706)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:11 elukey@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:11 elukey@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 14:10 elukey@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:10 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for [[gerrit:1247063{{!}}lawiki: add Adumbratio (draft) namespace (T418706)]]
* 14:10 elukey@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 14:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1028.eqiad.wmnet
* 14:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 14:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P89511 and previous config saved to /var/cache/conftool/dbconfig/20260302-140529-marostegui.json
* 14:04 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1097.eqiad.wmnet with reason: host reimage
* 14:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P89510 and previous config saved to /var/cache/conftool/dbconfig/20260302-140302-marostegui.json
* 14:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1028.eqiad.wmnet
* 14:01 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1160.eqiad.wmnet
* 14:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 14:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1159.eqiad.wmnet
* 14:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1025.eqiad.wmnet
* 13:57 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1097.eqiad.wmnet with reason: host reimage
* 13:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1025.eqiad.wmnet
* 13:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89509 and previous config saved to /var/cache/conftool/dbconfig/20260302-135021-marostegui.json
* 13:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P89508 and previous config saved to /var/cache/conftool/dbconfig/20260302-134754-marostegui.json
* 13:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1159.eqiad.wmnet
* 13:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1158.eqiad.wmnet
* 13:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1097.eqiad.wmnet with OS bullseye
* 13:38 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1097
* 13:38 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1097
* 13:38 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:38 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt ms-be1097 - jclark@cumin1003"
* 13:37 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt ms-be1097 - jclark@cumin1003"
* 13:35 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1158.eqiad.wmnet
* 13:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1157.eqiad.wmnet
* 13:35 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
* 13:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1214 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89507 and previous config saved to /var/cache/conftool/dbconfig/20260302-133503-marostegui.json
* 13:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1214.eqiad.wmnet with reason: Maintenance
* 13:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89506 and previous config saved to /var/cache/conftool/dbconfig/20260302-133440-marostegui.json
* 13:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89505 and previous config saved to /var/cache/conftool/dbconfig/20260302-133247-marostegui.json
* 13:28 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 13:27 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 13:27 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1097
* 13:26 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1097
* 13:24 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1157.eqiad.wmnet
* 13:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1156.eqiad.wmnet
* 13:22 brouberol: Running `echo 'https://turnilo-next.wikimedia.org' {{!}} mwscript-k8s --attach -- purgeList.php`
* 13:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P89504 and previous config saved to /var/cache/conftool/dbconfig/20260302-131932-marostegui.json
* 13:16 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2181 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89503 and previous config saved to /var/cache/conftool/dbconfig/20260302-131653-marostegui.json
* 13:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2181.codfw.wmnet with reason: Maintenance
* 13:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89502 and previous config saved to /var/cache/conftool/dbconfig/20260302-131630-marostegui.json
* 13:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1024.eqiad.wmnet
* 13:14 moritzm: installing libcap2 updates from Bookworm point release
* 13:12 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1156.eqiad.wmnet
* 13:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1155.eqiad.wmnet
* 13:08 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1024.eqiad.wmnet
* 13:07 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
* 13:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P89500 and previous config saved to /var/cache/conftool/dbconfig/20260302-130424-marostegui.json
* 13:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P89499 and previous config saved to /var/cache/conftool/dbconfig/20260302-130122-marostegui.json
* 13:00 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 12:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2356.codfw.wmnet
* 12:58 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2356.codfw.wmnet
* 12:58 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1155.eqiad.wmnet
* 12:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1154.eqiad.wmnet
* 12:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89498 and previous config saved to /var/cache/conftool/dbconfig/20260302-124917-marostegui.json
* 12:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1154.eqiad.wmnet
* 12:46 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1153.eqiad.wmnet
* 12:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P89497 and previous config saved to /var/cache/conftool/dbconfig/20260302-124615-marostegui.json
* 12:35 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1153.eqiad.wmnet
* 12:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1152.eqiad.wmnet
* 12:32 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89494 and previous config saved to /var/cache/conftool/dbconfig/20260302-123253-marostegui.json
* 12:32 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1203.eqiad.wmnet with reason: Maintenance
* 12:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89493 and previous config saved to /var/cache/conftool/dbconfig/20260302-123229-marostegui.json
* 12:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89492 and previous config saved to /var/cache/conftool/dbconfig/20260302-123108-marostegui.json
* 12:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 12:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 12:24 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1152.eqiad.wmnet
* 12:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1151.eqiad.wmnet
* 12:23 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 12:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 12:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 12:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P89491 and previous config saved to /var/cache/conftool/dbconfig/20260302-121722-marostegui.json
* 12:15 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89490 and previous config saved to /var/cache/conftool/dbconfig/20260302-121525-marostegui.json
* 12:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2167.codfw.wmnet with reason: Maintenance
* 12:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89489 and previous config saved to /var/cache/conftool/dbconfig/20260302-121501-marostegui.json
* 12:12 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1151.eqiad.wmnet
* 12:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1150.eqiad.wmnet
* 12:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P89488 and previous config saved to /var/cache/conftool/dbconfig/20260302-120214-marostegui.json
* 12:00 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1150.eqiad.wmnet
* 11:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P89487 and previous config saved to /var/cache/conftool/dbconfig/20260302-115953-marostegui.json
* 11:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89486 and previous config saved to /var/cache/conftool/dbconfig/20260302-114706-marostegui.json
* 11:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P89485 and previous config saved to /var/cache/conftool/dbconfig/20260302-114446-marostegui.json
* 11:30 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1193 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89484 and previous config saved to /var/cache/conftool/dbconfig/20260302-113034-marostegui.json
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo1001.eqiad.wmnet
* 11:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1193.eqiad.wmnet with reason: Maintenance
* 11:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89483 and previous config saved to /var/cache/conftool/dbconfig/20260302-113010-marostegui.json
* 11:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89482 and previous config saved to /var/cache/conftool/dbconfig/20260302-112937-marostegui.json
* 11:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo1001.eqiad.wmnet
* 11:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P89481 and previous config saved to /var/cache/conftool/dbconfig/20260302-111502-marostegui.json
* 11:13 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2166 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89480 and previous config saved to /var/cache/conftool/dbconfig/20260302-111351-marostegui.json
* 11:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2166.codfw.wmnet with reason: Maintenance
* 11:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89479 and previous config saved to /var/cache/conftool/dbconfig/20260302-111327-marostegui.json
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet
* 10:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P89478 and previous config saved to /var/cache/conftool/dbconfig/20260302-105955-marostegui.json
* 10:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P89477 and previous config saved to /var/cache/conftool/dbconfig/20260302-105818-marostegui.json
* 10:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet
* 10:55 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:54 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:52 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 10:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 10:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 10:49 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 10:46 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru and A:cp - 3.0 upgrade ()
* 10:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89476 and previous config saved to /var/cache/conftool/dbconfig/20260302-104446-marostegui.json
* 10:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P89475 and previous config saved to /var/cache/conftool/dbconfig/20260302-104310-marostegui.json
* 10:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89474 and previous config saved to /var/cache/conftool/dbconfig/20260302-102825-marostegui.json
* 10:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1192.eqiad.wmnet with reason: Maintenance
* 10:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89473 and previous config saved to /var/cache/conftool/dbconfig/20260302-102800-marostegui.json
* 10:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P89472 and previous config saved to /var/cache/conftool/dbconfig/20260302-101252-marostegui.json
* 10:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2164 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89471 and previous config saved to /var/cache/conftool/dbconfig/20260302-101200-marostegui.json
* 10:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2164.codfw.wmnet with reason: Maintenance
* 10:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89470 and previous config saved to /var/cache/conftool/dbconfig/20260302-101135-marostegui.json
* 10:08 moritzm: installing intel-microcode bugfix updates on Bookworm hosts
* 09:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P89469 and previous config saved to /var/cache/conftool/dbconfig/20260302-095744-marostegui.json
* 09:57 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru and A:cp - 3.0 upgrade ()
* 09:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P89468 and previous config saved to /var/cache/conftool/dbconfig/20260302-095627-marostegui.json
* 09:55 fabfur: start upgrading haproxy to 3.0 on A:cp-text_magru ([[phab:T417253|T417253]])
* 09:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89467 and previous config saved to /var/cache/conftool/dbconfig/20260302-094236-marostegui.json
* 09:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P89466 and previous config saved to /var/cache/conftool/dbconfig/20260302-094118-marostegui.json
* 09:35 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:35 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:34 moritzm: installing GnuTLS security updates
* 09:34 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:33 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89465 and previous config saved to /var/cache/conftool/dbconfig/20260302-092610-marostegui.json
* 09:26 mlitn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1245265{{!}}Limit additional whitespace to sticky header version only (T416598)]] (duration: 11m 02s)
* 09:26 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89464 and previous config saved to /var/cache/conftool/dbconfig/20260302-092600-marostegui.json
* 09:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1178.eqiad.wmnet with reason: Maintenance
* 09:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89463 and previous config saved to /var/cache/conftool/dbconfig/20260302-092535-marostegui.json
* 09:21 mlitn@deploy2002: mlitn: Continuing with sync
* 09:16 mlitn@deploy2002: mlitn: Backport for [[gerrit:1245265{{!}}Limit additional whitespace to sticky header version only (T416598)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:15 mlitn@deploy2002: Started scap sync-world: Backport for [[gerrit:1245265{{!}}Limit additional whitespace to sticky header version only (T416598)]]
* 09:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P89462 and previous config saved to /var/cache/conftool/dbconfig/20260302-091027-marostegui.json
* 09:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2163 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89461 and previous config saved to /var/cache/conftool/dbconfig/20260302-091003-marostegui.json
* 09:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2163.codfw.wmnet with reason: Maintenance
* 09:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89460 and previous config saved to /var/cache/conftool/dbconfig/20260302-090938-marostegui.json
* 09:08 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1246904{{!}}HCaptchaEnterpriseHealthChecker: Add configurable retry count and delay (T418477)]] (duration: 16m 09s)
* 09:02 kharlan@deploy2002: kharlan: Continuing with sync
* 08:57 kharlan@deploy2002: kharlan: Backport for [[gerrit:1246904{{!}}HCaptchaEnterpriseHealthChecker: Add configurable retry count and delay (T418477)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P89459 and previous config saved to /var/cache/conftool/dbconfig/20260302-085519-marostegui.json
* 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P89458 and previous config saved to /var/cache/conftool/dbconfig/20260302-085430-marostegui.json
* 08:51 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1246904{{!}}HCaptchaEnterpriseHealthChecker: Add configurable retry count and delay (T418477)]]
* 08:48 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru and A:cp - 3.0 upgrade ()
* 08:47 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbproxy1029.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 08:45 moritzm: installing libxml2 security updates
* 08:44 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1240672{{!}}Enable revert risk filters for first batch of wikis: < 1000 monthly edits (T411485)]] (duration: 37m 12s)
* 08:42 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dbproxy1029.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 08:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbproxy1028.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 08:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89457 and previous config saved to /var/cache/conftool/dbconfig/20260302-084010-marostegui.json
* 08:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P89456 and previous config saved to /var/cache/conftool/dbconfig/20260302-083922-marostegui.json
* 08:31 kgraessle@deploy2002: kgraessle: Continuing with sync
* 08:30 kgraessle@deploy2002: kgraessle: Backport for [[gerrit:1240672{{!}}Enable revert risk filters for first batch of wikis: < 1000 monthly edits (T411485)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dbproxy1028.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 08:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89455 and previous config saved to /var/cache/conftool/dbconfig/20260302-082414-marostegui.json
* 08:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1177 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89454 and previous config saved to /var/cache/conftool/dbconfig/20260302-082333-marostegui.json
* 08:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1177.eqiad.wmnet with reason: Maintenance
* 08:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89453 and previous config saved to /var/cache/conftool/dbconfig/20260302-082309-marostegui.json
* 08:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbproxy1028.eqiad.wmnet with reason: Maintenance
* 08:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbproxy1029.eqiad.wmnet with reason: Maintenance
* 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89452 and previous config saved to /var/cache/conftool/dbconfig/20260302-080813-marostegui.json
* 08:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2161.codfw.wmnet with reason: Maintenance
* 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P89451 and previous config saved to /var/cache/conftool/dbconfig/20260302-080800-marostegui.json
* 08:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89450 and previous config saved to /var/cache/conftool/dbconfig/20260302-080748-marostegui.json
* 08:07 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1240672{{!}}Enable revert risk filters for first batch of wikis: < 1000 monthly edits (T411485)]]
* 08:05 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru and A:cp - 3.0 upgrade ()
* 08:05 fabfur: start upgrading haproxy to 3.0 on A:cp-upload_magru ([[phab:T417253|T417253]])
* 07:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P89449 and previous config saved to /var/cache/conftool/dbconfig/20260302-075252-marostegui.json
* 07:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P89448 and previous config saved to /var/cache/conftool/dbconfig/20260302-075241-marostegui.json
* 07:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89447 and previous config saved to /var/cache/conftool/dbconfig/20260302-073745-marostegui.json
* 07:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P89446 and previous config saved to /var/cache/conftool/dbconfig/20260302-073732-marostegui.json
* 07:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89445 and previous config saved to /var/cache/conftool/dbconfig/20260302-072224-marostegui.json
* 07:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89444 and previous config saved to /var/cache/conftool/dbconfig/20260302-072058-marostegui.json
* 07:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1172.eqiad.wmnet with reason: Maintenance
* 07:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 07:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89443 and previous config saved to /var/cache/conftool/dbconfig/20260302-070523-marostegui.json
* 07:05 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2154 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89442 and previous config saved to /var/cache/conftool/dbconfig/20260302-070512-marostegui.json
* 07:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2154.codfw.wmnet with reason: Maintenance
* 07:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89441 and previous config saved to /var/cache/conftool/dbconfig/20260302-070447-marostegui.json
* 07:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1244: After schema change
* 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P89439 and previous config saved to /var/cache/conftool/dbconfig/20260302-065014-marostegui.json
* 06:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P89438 and previous config saved to /var/cache/conftool/dbconfig/20260302-064938-marostegui.json
* 06:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P89436 and previous config saved to /var/cache/conftool/dbconfig/20260302-063506-marostegui.json
* 06:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P89435 and previous config saved to /var/cache/conftool/dbconfig/20260302-063430-marostegui.json
* 06:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89433 and previous config saved to /var/cache/conftool/dbconfig/20260302-061957-marostegui.json
* 06:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89432 and previous config saved to /var/cache/conftool/dbconfig/20260302-061922-marostegui.json
* 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2240.codfw.wmnet with reason: Maintenance
* 06:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1244: After schema change
* 06:15 marostegui@dns1004: END - running authdns-update
* 06:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2240 [[phab:T418080|T418080]]', diff saved to https://phabricator.wikimedia.org/P89430 and previous config saved to /var/cache/conftool/dbconfig/20260302-061428-marostegui.json
* 06:13 marostegui@dns1004: START - running authdns-update
* 06:13 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2179 to s4 primary and set section read-write [[phab:T418080|T418080]]', diff saved to https://phabricator.wikimedia.org/P89429 and previous config saved to /var/cache/conftool/dbconfig/20260302-061316-marostegui.json
* 06:12 marostegui@cumin1003: dbctl commit (dc=all): 'Set s4 codfw as read-only for maintenance - [[phab:T418080|T418080]]', diff saved to https://phabricator.wikimedia.org/P89428 and previous config saved to /var/cache/conftool/dbconfig/20260302-061252-marostegui.json
* 06:06 marostegui: Starting s4 codfw failover from db2240 to db2179 - [[phab:T418080|T418080]]
* 06:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 42 hosts with reason: Primary switchover s4 [[phab:T418080|T418080]]
* 06:03 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2179 with weight 0 [[phab:T418080|T418080]]', diff saved to https://phabricator.wikimedia.org/P89427 and previous config saved to /var/cache/conftool/dbconfig/20260302-060317-marostegui.json
* 06:03 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89426 and previous config saved to /var/cache/conftool/dbconfig/20260302-060317-marostegui.json
* 06:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 06:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 06:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2152 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89425 and previous config saved to /var/cache/conftool/dbconfig/20260302-060245-marostegui.json
* 06:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2152.codfw.wmnet with reason: Maintenance
* 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2220.codfw.wmnet with reason: Maintenance
* 02:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 13s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 00:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 00:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89424 and previous config saved to /var/cache/conftool/dbconfig/20260302-004950-marostegui.json
* 00:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P89423 and previous config saved to /var/cache/conftool/dbconfig/20260302-003441-marostegui.json
* 00:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P89422 and previous config saved to /var/cache/conftool/dbconfig/20260302-001933-marostegui.json
* 00:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89421 and previous config saved to /var/cache/conftool/dbconfig/20260302-000425-marostegui.json
* 00:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1253 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89420 and previous config saved to /var/cache/conftool/dbconfig/20260302-000208-marostegui.json
* 00:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1253.eqiad.wmnet with reason: Maintenance
* 00:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89419 and previous config saved to /var/cache/conftool/dbconfig/20260302-000143-marostegui.json
== 2026-03-01 ==
* 23:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P89418 and previous config saved to /var/cache/conftool/dbconfig/20260301-234635-marostegui.json
* 23:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89417 and previous config saved to /var/cache/conftool/dbconfig/20260301-233524-marostegui.json
* 23:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P89416 and previous config saved to /var/cache/conftool/dbconfig/20260301-233127-marostegui.json
* 23:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P89415 and previous config saved to /var/cache/conftool/dbconfig/20260301-232016-marostegui.json
* 23:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89414 and previous config saved to /var/cache/conftool/dbconfig/20260301-231619-marostegui.json
* 23:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89413 and previous config saved to /var/cache/conftool/dbconfig/20260301-231404-marostegui.json
* 23:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1236.eqiad.wmnet with reason: Maintenance
* 23:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89412 and previous config saved to /var/cache/conftool/dbconfig/20260301-231339-marostegui.json
* 23:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P89411 and previous config saved to /var/cache/conftool/dbconfig/20260301-230508-marostegui.json
* 22:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P89410 and previous config saved to /var/cache/conftool/dbconfig/20260301-225832-marostegui.json
* 22:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89409 and previous config saved to /var/cache/conftool/dbconfig/20260301-224959-marostegui.json
* 22:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2222 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89408 and previous config saved to /var/cache/conftool/dbconfig/20260301-224451-marostegui.json
* 22:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2222.codfw.wmnet with reason: Maintenance
* 22:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89407 and previous config saved to /var/cache/conftool/dbconfig/20260301-224426-marostegui.json
* 22:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P89406 and previous config saved to /var/cache/conftool/dbconfig/20260301-224324-marostegui.json
* 22:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P89405 and previous config saved to /var/cache/conftool/dbconfig/20260301-222919-marostegui.json
* 22:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89404 and previous config saved to /var/cache/conftool/dbconfig/20260301-222815-marostegui.json
* 22:26 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1231 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89403 and previous config saved to /var/cache/conftool/dbconfig/20260301-222600-marostegui.json
* 22:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1231.eqiad.wmnet with reason: Maintenance
* 22:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89402 and previous config saved to /var/cache/conftool/dbconfig/20260301-222536-marostegui.json
* 22:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P89401 and previous config saved to /var/cache/conftool/dbconfig/20260301-221410-marostegui.json
* 22:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P89400 and previous config saved to /var/cache/conftool/dbconfig/20260301-221027-marostegui.json
* 21:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89399 and previous config saved to /var/cache/conftool/dbconfig/20260301-215902-marostegui.json
* 21:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P89398 and previous config saved to /var/cache/conftool/dbconfig/20260301-215519-marostegui.json
* 21:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89397 and previous config saved to /var/cache/conftool/dbconfig/20260301-215404-marostegui.json
* 21:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2221.codfw.wmnet with reason: Maintenance
* 21:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89396 and previous config saved to /var/cache/conftool/dbconfig/20260301-215339-marostegui.json
* 21:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89395 and previous config saved to /var/cache/conftool/dbconfig/20260301-214011-marostegui.json
* 21:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P89394 and previous config saved to /var/cache/conftool/dbconfig/20260301-213831-marostegui.json
* 21:34 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89393 and previous config saved to /var/cache/conftool/dbconfig/20260301-213410-marostegui.json
* 21:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 21:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89392 and previous config saved to /var/cache/conftool/dbconfig/20260301-213346-marostegui.json
* 21:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P89391 and previous config saved to /var/cache/conftool/dbconfig/20260301-212323-marostegui.json
* 21:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P89390 and previous config saved to /var/cache/conftool/dbconfig/20260301-211837-marostegui.json
* 21:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89389 and previous config saved to /var/cache/conftool/dbconfig/20260301-210815-marostegui.json
* 21:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P89388 and previous config saved to /var/cache/conftool/dbconfig/20260301-210329-marostegui.json
* 21:03 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89387 and previous config saved to /var/cache/conftool/dbconfig/20260301-210309-marostegui.json
* 21:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2218.codfw.wmnet with reason: Maintenance
* 21:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89386 and previous config saved to /var/cache/conftool/dbconfig/20260301-210244-marostegui.json
* 20:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89385 and previous config saved to /var/cache/conftool/dbconfig/20260301-204820-marostegui.json
* 20:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P89384 and previous config saved to /var/cache/conftool/dbconfig/20260301-204736-marostegui.json
* 20:46 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1202 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89383 and previous config saved to /var/cache/conftool/dbconfig/20260301-204606-marostegui.json
* 20:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1202.eqiad.wmnet with reason: Maintenance
* 20:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89382 and previous config saved to /var/cache/conftool/dbconfig/20260301-204541-marostegui.json
* 20:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P89381 and previous config saved to /var/cache/conftool/dbconfig/20260301-203227-marostegui.json
* 20:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P89380 and previous config saved to /var/cache/conftool/dbconfig/20260301-203033-marostegui.json
* 20:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89379 and previous config saved to /var/cache/conftool/dbconfig/20260301-201720-marostegui.json
* 20:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P89378 and previous config saved to /var/cache/conftool/dbconfig/20260301-201525-marostegui.json
* 20:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2208 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89377 and previous config saved to /var/cache/conftool/dbconfig/20260301-201212-marostegui.json
* 20:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2208.codfw.wmnet with reason: Maintenance
* 20:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2200.codfw.wmnet with reason: Maintenance
* 20:04 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2198.codfw.wmnet with reason: Maintenance
* 20:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89376 and previous config saved to /var/cache/conftool/dbconfig/20260301-200422-marostegui.json
* 20:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89375 and previous config saved to /var/cache/conftool/dbconfig/20260301-200016-marostegui.json
* 19:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1194 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89374 and previous config saved to /var/cache/conftool/dbconfig/20260301-195803-marostegui.json
* 19:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1194.eqiad.wmnet with reason: Maintenance
* 19:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89373 and previous config saved to /var/cache/conftool/dbconfig/20260301-195738-marostegui.json
* 19:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P89372 and previous config saved to /var/cache/conftool/dbconfig/20260301-194914-marostegui.json
* 19:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P89371 and previous config saved to /var/cache/conftool/dbconfig/20260301-194230-marostegui.json
* 19:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P89370 and previous config saved to /var/cache/conftool/dbconfig/20260301-193406-marostegui.json
* 19:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P89369 and previous config saved to /var/cache/conftool/dbconfig/20260301-192721-marostegui.json
* 19:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89368 and previous config saved to /var/cache/conftool/dbconfig/20260301-191858-marostegui.json
* 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2182 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89367 and previous config saved to /var/cache/conftool/dbconfig/20260301-191340-marostegui.json
* 19:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89366 and previous config saved to /var/cache/conftool/dbconfig/20260301-191315-marostegui.json
* 19:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89365 and previous config saved to /var/cache/conftool/dbconfig/20260301-191213-marostegui.json
* 19:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1191 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89364 and previous config saved to /var/cache/conftool/dbconfig/20260301-190958-marostegui.json
* 19:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1191.eqiad.wmnet with reason: Maintenance
* 19:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89363 and previous config saved to /var/cache/conftool/dbconfig/20260301-190934-marostegui.json
* 18:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P89362 and previous config saved to /var/cache/conftool/dbconfig/20260301-185807-marostegui.json
* 18:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P89361 and previous config saved to /var/cache/conftool/dbconfig/20260301-185425-marostegui.json
* 18:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P89360 and previous config saved to /var/cache/conftool/dbconfig/20260301-184259-marostegui.json
* 18:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P89359 and previous config saved to /var/cache/conftool/dbconfig/20260301-183917-marostegui.json
* 18:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89358 and previous config saved to /var/cache/conftool/dbconfig/20260301-182750-marostegui.json
* 18:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89357 and previous config saved to /var/cache/conftool/dbconfig/20260301-182409-marostegui.json
* 18:22 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2168 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89356 and previous config saved to /var/cache/conftool/dbconfig/20260301-182238-marostegui.json
* 18:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 18:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89355 and previous config saved to /var/cache/conftool/dbconfig/20260301-182213-marostegui.json
* 18:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89354 and previous config saved to /var/cache/conftool/dbconfig/20260301-182153-marostegui.json
* 18:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 18:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 18:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89353 and previous config saved to /var/cache/conftool/dbconfig/20260301-181818-marostegui.json
* 18:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P89352 and previous config saved to /var/cache/conftool/dbconfig/20260301-180705-marostegui.json
* 18:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P89351 and previous config saved to /var/cache/conftool/dbconfig/20260301-180310-marostegui.json
* 17:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P89350 and previous config saved to /var/cache/conftool/dbconfig/20260301-175157-marostegui.json
* 17:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P89349 and previous config saved to /var/cache/conftool/dbconfig/20260301-174802-marostegui.json
* 17:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89348 and previous config saved to /var/cache/conftool/dbconfig/20260301-173649-marostegui.json
* 17:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89347 and previous config saved to /var/cache/conftool/dbconfig/20260301-173253-marostegui.json
* 17:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89346 and previous config saved to /var/cache/conftool/dbconfig/20260301-173134-marostegui.json
* 17:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 17:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89345 and previous config saved to /var/cache/conftool/dbconfig/20260301-173110-marostegui.json
* 17:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89344 and previous config saved to /var/cache/conftool/dbconfig/20260301-172742-marostegui.json
* 17:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 17:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89343 and previous config saved to /var/cache/conftool/dbconfig/20260301-172717-marostegui.json
* 17:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P89342 and previous config saved to /var/cache/conftool/dbconfig/20260301-171602-marostegui.json
* 17:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P89341 and previous config saved to /var/cache/conftool/dbconfig/20260301-171210-marostegui.json
* 17:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P89340 and previous config saved to /var/cache/conftool/dbconfig/20260301-170053-marostegui.json
* 16:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P89339 and previous config saved to /var/cache/conftool/dbconfig/20260301-165701-marostegui.json
* 16:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89338 and previous config saved to /var/cache/conftool/dbconfig/20260301-164545-marostegui.json
* 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89337 and previous config saved to /var/cache/conftool/dbconfig/20260301-164153-marostegui.json
* 16:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2150 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89336 and previous config saved to /var/cache/conftool/dbconfig/20260301-164022-marostegui.json
* 16:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2150.codfw.wmnet with reason: Maintenance
* 16:39 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89335 and previous config saved to /var/cache/conftool/dbconfig/20260301-163938-marostegui.json
* 16:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 16:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 16:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1230.eqiad.wmnet with reason: Maintenance
* 16:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2213.codfw.wmnet with reason: Maintenance
* 12:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89334 and previous config saved to /var/cache/conftool/dbconfig/20260301-122201-marostegui.json
* 12:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P89333 and previous config saved to /var/cache/conftool/dbconfig/20260301-120652-marostegui.json
* 11:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P89332 and previous config saved to /var/cache/conftool/dbconfig/20260301-115144-marostegui.json
* 11:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89331 and previous config saved to /var/cache/conftool/dbconfig/20260301-113636-marostegui.json
* 11:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2228 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89330 and previous config saved to /var/cache/conftool/dbconfig/20260301-113156-marostegui.json
* 11:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2228.codfw.wmnet with reason: Maintenance
* 11:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89329 and previous config saved to /var/cache/conftool/dbconfig/20260301-113131-marostegui.json
* 11:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 11:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 11:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1216.eqiad.wmnet with reason: Maintenance
* 11:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89328 and previous config saved to /var/cache/conftool/dbconfig/20260301-111658-marostegui.json
* 11:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P89327 and previous config saved to /var/cache/conftool/dbconfig/20260301-111622-marostegui.json
* 11:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P89326 and previous config saved to /var/cache/conftool/dbconfig/20260301-110151-marostegui.json
* 11:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P89325 and previous config saved to /var/cache/conftool/dbconfig/20260301-110114-marostegui.json
* 10:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P89324 and previous config saved to /var/cache/conftool/dbconfig/20260301-104642-marostegui.json
* 10:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89323 and previous config saved to /var/cache/conftool/dbconfig/20260301-104606-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2223 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89322 and previous config saved to /var/cache/conftool/dbconfig/20260301-104024-marostegui.json
* 10:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2223.codfw.wmnet with reason: Maintenance
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89321 and previous config saved to /var/cache/conftool/dbconfig/20260301-103958-marostegui.json
* 10:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89320 and previous config saved to /var/cache/conftool/dbconfig/20260301-103134-marostegui.json
* 10:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89319 and previous config saved to /var/cache/conftool/dbconfig/20260301-102727-marostegui.json
* 10:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1210.eqiad.wmnet with reason: Maintenance
* 10:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89318 and previous config saved to /var/cache/conftool/dbconfig/20260301-102702-marostegui.json
* 10:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P89317 and previous config saved to /var/cache/conftool/dbconfig/20260301-102450-marostegui.json
* 10:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P89316 and previous config saved to /var/cache/conftool/dbconfig/20260301-101154-marostegui.json
* 10:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P89315 and previous config saved to /var/cache/conftool/dbconfig/20260301-100942-marostegui.json
* 09:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P89314 and previous config saved to /var/cache/conftool/dbconfig/20260301-095645-marostegui.json
* 09:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89313 and previous config saved to /var/cache/conftool/dbconfig/20260301-095434-marostegui.json
* 09:48 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2211 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89312 and previous config saved to /var/cache/conftool/dbconfig/20260301-094847-marostegui.json
* 09:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2211.codfw.wmnet with reason: Maintenance
* 09:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2201.codfw.wmnet with reason: Maintenance
* 09:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89311 and previous config saved to /var/cache/conftool/dbconfig/20260301-094432-marostegui.json
* 09:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89310 and previous config saved to /var/cache/conftool/dbconfig/20260301-094137-marostegui.json
* 09:38 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1207 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89309 and previous config saved to /var/cache/conftool/dbconfig/20260301-093835-marostegui.json
* 09:38 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1207.eqiad.wmnet with reason: Maintenance
* 09:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89308 and previous config saved to /var/cache/conftool/dbconfig/20260301-093810-marostegui.json
* 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P89307 and previous config saved to /var/cache/conftool/dbconfig/20260301-092923-marostegui.json
* 09:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P89306 and previous config saved to /var/cache/conftool/dbconfig/20260301-092302-marostegui.json
* 09:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P89305 and previous config saved to /var/cache/conftool/dbconfig/20260301-091415-marostegui.json
* 09:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P89304 and previous config saved to /var/cache/conftool/dbconfig/20260301-090754-marostegui.json
* 08:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89303 and previous config saved to /var/cache/conftool/dbconfig/20260301-085907-marostegui.json
* 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89302 and previous config saved to /var/cache/conftool/dbconfig/20260301-085427-marostegui.json
* 08:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2192.codfw.wmnet with reason: Maintenance
* 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89301 and previous config saved to /var/cache/conftool/dbconfig/20260301-085403-marostegui.json
* 08:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89300 and previous config saved to /var/cache/conftool/dbconfig/20260301-085246-marostegui.json
* 08:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1200 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89299 and previous config saved to /var/cache/conftool/dbconfig/20260301-084952-marostegui.json
* 08:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1200.eqiad.wmnet with reason: Maintenance
* 08:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89298 and previous config saved to /var/cache/conftool/dbconfig/20260301-084928-marostegui.json
* 08:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P89297 and previous config saved to /var/cache/conftool/dbconfig/20260301-083855-marostegui.json
* 08:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P89296 and previous config saved to /var/cache/conftool/dbconfig/20260301-083420-marostegui.json
* 08:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P89295 and previous config saved to /var/cache/conftool/dbconfig/20260301-082346-marostegui.json
* 08:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P89294 and previous config saved to /var/cache/conftool/dbconfig/20260301-081912-marostegui.json
* 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89293 and previous config saved to /var/cache/conftool/dbconfig/20260301-080838-marostegui.json
* 08:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89292 and previous config saved to /var/cache/conftool/dbconfig/20260301-080404-marostegui.json
* 08:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2178.codfw.wmnet with reason: Maintenance
* 08:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89291 and previous config saved to /var/cache/conftool/dbconfig/20260301-080341-marostegui.json
* 08:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1185 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89290 and previous config saved to /var/cache/conftool/dbconfig/20260301-080110-marostegui.json
* 08:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1185.eqiad.wmnet with reason: Maintenance
* 08:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89289 and previous config saved to /var/cache/conftool/dbconfig/20260301-080044-marostegui.json
* 07:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P89288 and previous config saved to /var/cache/conftool/dbconfig/20260301-074833-marostegui.json
* 07:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P89287 and previous config saved to /var/cache/conftool/dbconfig/20260301-074536-marostegui.json
* 07:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P89286 and previous config saved to /var/cache/conftool/dbconfig/20260301-073324-marostegui.json
* 07:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P89285 and previous config saved to /var/cache/conftool/dbconfig/20260301-073028-marostegui.json
* 07:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89284 and previous config saved to /var/cache/conftool/dbconfig/20260301-071816-marostegui.json
* 07:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89283 and previous config saved to /var/cache/conftool/dbconfig/20260301-071521-marostegui.json
* 07:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2171 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89282 and previous config saved to /var/cache/conftool/dbconfig/20260301-071226-marostegui.json
* 07:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 07:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89281 and previous config saved to /var/cache/conftool/dbconfig/20260301-071201-marostegui.json
* 07:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89280 and previous config saved to /var/cache/conftool/dbconfig/20260301-071113-marostegui.json
* 07:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 07:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 07:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89279 and previous config saved to /var/cache/conftool/dbconfig/20260301-071040-marostegui.json
* 06:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P89278 and previous config saved to /var/cache/conftool/dbconfig/20260301-065653-marostegui.json
* 06:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P89277 and previous config saved to /var/cache/conftool/dbconfig/20260301-065531-marostegui.json
* 06:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P89276 and previous config saved to /var/cache/conftool/dbconfig/20260301-064145-marostegui.json
* 06:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P89275 and previous config saved to /var/cache/conftool/dbconfig/20260301-064023-marostegui.json
* 06:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89274 and previous config saved to /var/cache/conftool/dbconfig/20260301-062636-marostegui.json
* 06:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89273 and previous config saved to /var/cache/conftool/dbconfig/20260301-062515-marostegui.json
* 06:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89272 and previous config saved to /var/cache/conftool/dbconfig/20260301-062108-marostegui.json
* 06:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1159.eqiad.wmnet with reason: Maintenance
* 06:20 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2157 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89271 and previous config saved to /var/cache/conftool/dbconfig/20260301-062047-marostegui.json
* 06:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2157.codfw.wmnet with reason: Maintenance
* 06:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2207.codfw.wmnet with reason: Maintenance
* 06:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1222.eqiad.wmnet with reason: Maintenance
* 02:13 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 00s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2396604
2396603
2026-03-28T14:48:03Z
Stashbot
7414
dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: T421398
2396604
wikitext
text/x-wiki
== 2026-03-28 ==
* 14:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 14:16 mutante: releases1003 - re-enabled puppet, which was disabled due to [[phab:T418109|T418109]] but should not have been disabled during the switch of the deployment server, leading to [[phab:T421532|T421532]]
== 2026-03-27 ==
* 18:11 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 18:00 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:50 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:40 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:39 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 17:39 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:39 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 17:38 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 17:37 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 17:37 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 17:35 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 17:34 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 17:34 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:30 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:30 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:24 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:19 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:15 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:04 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:55 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:50 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:47 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:42 dancy@deploy1003: Finished deploy [releng/jenkins-deploy@31ace7e] (releasing): (no justification provided) (duration: 01m 18s)
* 16:41 dancy@deploy1003: Started deploy [releng/jenkins-deploy@31ace7e] (releasing): (no justification provided)
* 16:37 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:36 jhathaway@cumin2002: END (ERROR) - Cookbook sre.hosts.provision (exit_code=97) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:27 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:22 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:13 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 16:12 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 16:12 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:11 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:10 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 15:07 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:00 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:09 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:09 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change ips for frack servers - cmooney@cumin1003"
* 14:08 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change ips for frack servers - cmooney@cumin1003"
* 14:02 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 13:52 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:51 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:49 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:49 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:48 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:47 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:47 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:45 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:41 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:40 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:14 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:11 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:10 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:08 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:06 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 11:53 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 11:51 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 11:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 11:48 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 11:30 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 11:27 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 11:15 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-test1006.eqiad.wmnet with OS trixie
* 11:15 taavi@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database abstractwiki ([[phab:T420637|T420637]])
* 11:02 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2002.codfw.wmnet
* 10:54 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2006.codfw.wmnet
* 10:51 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2002.codfw.wmnet
* 10:50 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2006.codfw.wmnet
* 10:46 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2005.codfw.wmnet
* 10:43 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2005.codfw.wmnet
* 10:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1006.eqiad.wmnet with reason: host reimage
* 10:27 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1006.eqiad.wmnet with reason: host reimage
* 10:18 taavi@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database abstractwiki ([[phab:T420637|T420637]])
* 10:12 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1006.eqiad.wmnet with OS trixie
* 10:04 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 10:03 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:01 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:00 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:58 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:57 elukey@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
* 09:37 elukey@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
* 09:06 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 09:05 elukey@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 09:04 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 09:03 elukey@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 08:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 08:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:05 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 08:04 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 08:02 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 07:46 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 03:06 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 02:32 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 02:12 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 07s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:30 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421330|T421330]]
* 01:30 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on zuul1001.eqiad.wmnet with reason: [[phab:T421330|T421330]]
* 01:30 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on zuul2001.codfw.wmnet with reason: [[phab:T421330|T421330]]
* 01:29 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421330|T421330]]
* 01:12 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
== 2026-03-26 ==
* 21:35 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1260834{{!}}Add Logstash logging for successful passwordless logins]], [[gerrit:1261511{{!}}InitialiseSettings: Remove apiportalwiki from $wmgCentralAuthAutoLoginWikis (T421413)]] (duration: 06m 58s)
* 21:31 reedy@deploy1003: catrope, reedy: Continuing with sync
* 21:30 reedy@deploy1003: catrope, reedy: Backport for [[gerrit:1260834{{!}}Add Logstash logging for successful passwordless logins]], [[gerrit:1261511{{!}}InitialiseSettings: Remove apiportalwiki from $wmgCentralAuthAutoLoginWikis (T421413)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:28 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1260834{{!}}Add Logstash logging for successful passwordless logins]], [[gerrit:1261511{{!}}InitialiseSettings: Remove apiportalwiki from $wmgCentralAuthAutoLoginWikis (T421413)]]
* 21:00 suecarmol@deploy1003: Finished scap sync-world: Backport for [[gerrit:1256498{{!}}PersonalDashboard: Add config for Active Discussions (T420785)]] (duration: 13m 53s)
* 20:54 suecarmol@deploy1003: suecarmol: Continuing with sync
* 20:51 suecarmol@deploy1003: suecarmol: Backport for [[gerrit:1256498{{!}}PersonalDashboard: Add config for Active Discussions (T420785)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:46 suecarmol@deploy1003: Started scap sync-world: Backport for [[gerrit:1256498{{!}}PersonalDashboard: Add config for Active Discussions (T420785)]]
* 20:44 kamila@deploy1003: Finished scap sync-world: Backport for [[gerrit:1261545{{!}}Wrap 'centralauthtoken' in a JWT (T420280)]], [[gerrit:1261470{{!}}Enable $wgTempCategoryCollations for testwiki. (T419274 T419049)]] (duration: 37m 32s)
* 20:30 kamila@deploy1003: matmarex, kamila: Continuing with sync
* 20:25 kamila@deploy1003: matmarex, kamila: Backport for [[gerrit:1261545{{!}}Wrap 'centralauthtoken' in a JWT (T420280)]], [[gerrit:1261470{{!}}Enable $wgTempCategoryCollations for testwiki. (T419274 T419049)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host restbase2039.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:09 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host restbase2039.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:08 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>aqs1015.eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 20:08 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host restbase2039
* 20:07 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host restbase2039
* 20:06 kamila@deploy1003: Started scap sync-world: Backport for [[gerrit:1261545{{!}}Wrap 'centralauthtoken' in a JWT (T420280)]], [[gerrit:1261470{{!}}Enable $wgTempCategoryCollations for testwiki. (T419274 T419049)]]
* 20:05 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:05 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding restbase2039 to codfw - jhancock@cumin2002"
* 20:05 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding restbase2039 to codfw - jhancock@cumin2002"
* 20:02 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 20:00 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>aqs1015.eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 19:47 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 19:44 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 18:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 18:48 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 18:42 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/zotero: apply
* 18:42 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/zotero: apply
* 18:42 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
* 18:41 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
* 18:41 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
* 18:40 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/toolhub: apply
* 18:40 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/termbox: apply
* 18:40 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 18:39 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/termbox: apply
* 18:39 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 18:37 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 18:37 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 18:37 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 18:36 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 18:36 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 18:36 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 18:36 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
* 18:35 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
* 18:35 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/recommendation-api: apply
* 18:35 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/recommendation-api: apply
* 18:34 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rdf-streaming-updater: apply
* 18:34 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply
* 18:33 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/push-notifications: apply
* 18:33 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/push-notifications: apply
* 18:32 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 18:31 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 18:31 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
* 18:30 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
* 18:30 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 18:30 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
* 18:30 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 18:28 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 18:28 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:28 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:27 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
* 18:25 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>sessionstore1006.eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 18:21 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
* 18:21 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 18:20 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 18:20 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
* 18:19 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/ipoid: apply
* 18:19 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/image-suggestion: apply
* 18:19 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/image-suggestion: apply
* 18:18 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>sessionstore1006.eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 18:18 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
* 18:17 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
* 18:17 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
* 18:16 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
* 18:15 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 18:14 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 18:13 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
* 18:13 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
* 18:13 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
* 18:12 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
* 18:12 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:12 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 18:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 18:11 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 18:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
* 18:11 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
* 18:10 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
* 18:09 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
* 18:09 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/echostore: apply
* 18:08 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/echostore: apply
* 18:08 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
* 18:08 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
* 18:08 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 18:07 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 18:07 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
* 18:07 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
* 18:07 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 18:06 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 18:06 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/commons-impact-analytics: apply
* 18:06 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/commons-impact-analytics: apply
* 18:04 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 18:04 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 18:03 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 18:03 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 18:03 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/chart-renderer: apply
* 18:02 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/chart-renderer: apply
* 17:59 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/apertium: apply
* 17:58 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/apertium: apply
* 17:55 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to enable envoy drain on remaining services - [[phab:T364245|T364245]] (duration: 05m 31s)
* 17:52 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to enable envoy drain on remaining services - [[phab:T364245|T364245]]
* 17:35 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:39 rzl@deploy1003: Finished scap sync-world: https://gerrit.wikimedia.org/r/1256396 [[phab:T420666|T420666]] (duration: 11m 21s)
* 16:35 rzl@deploy1003: rzl: Continuing with sync
* 16:34 rzl@deploy1003: rzl: https://gerrit.wikimedia.org/r/1256396 [[phab:T420666|T420666]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:31 rzl@deploy1003: Started scap sync-world: https://gerrit.wikimedia.org/r/1256396 [[phab:T420666|T420666]]
* 16:27 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 16:17 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 16:17 blake@deploy1003: Finished scap sync-world: Test deployment to validate deployment server switchover - [[phab:T413974|T413974]] (duration: 31m 09s)
* 16:16 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 16:05 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 15:47 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1202.eqiad.wmnet onto db1253.eqiad.wmnet
* 15:47 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1253: Pool db1253.eqiad.wmnet in after cloning
* 15:46 blake@deploy1003: Started scap sync-world: Test deployment to validate deployment server switchover - [[phab:T413974|T413974]]
* 15:44 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 15:43 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421330|T421330]]
* 15:43 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421330|T421330]]
* 15:33 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 15:30 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync
* 15:30 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: sync
* 15:23 blake@dns1004: END - running authdns-update
* 15:22 bjensen: updating DNS for the deployment host switchover
* 15:21 blake@dns1004: START - running authdns-update
* 15:19 blake@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet,releases1003.eqiad.wmnet with reason: Deployment server switchover
* 15:03 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:01 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1253: Pool db1253.eqiad.wmnet in after cloning
* 14:39 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:28 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:23 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 14:22 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 14:21 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 14:21 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 14:21 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:20 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 14:20 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 14:20 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 14:19 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 14:18 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 14:18 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 14:17 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 14:17 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 14:17 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 14:17 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 14:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1202: Pool db1202.eqiad.wmnet in after cloning
* 13:57 jynus: dropping ms-backup[12]00[12] grants from backup1-* dbs [[phab:T420464|T420464]]
* 13:56 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1070.eqiad.wmnet
* 13:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1070.eqiad.wmnet
* 13:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1097.eqiad.wmnet
* 13:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1097.eqiad.wmnet
* 13:54 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1055.eqiad.wmnet
* 13:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1055.eqiad.wmnet
* 13:46 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:45 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:40 sergi0: UTC afternoon backport window done
* 13:39 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259132{{!}}GrowthExperiments: scale edit and thanks query limit to more wikis (T341599)]] (duration: 09m 17s)
* 13:35 sgimeno@deploy2002: sgimeno: Continuing with sync
* 13:32 sgimeno@deploy2002: sgimeno: Backport for [[gerrit:1259132{{!}}GrowthExperiments: scale edit and thanks query limit to more wikis (T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1259132{{!}}GrowthExperiments: scale edit and thanks query limit to more wikis (T341599)]]
* 13:26 jforrester@deploy2002: Finished deploy [integration/docroot@f021d3f]: {{Gerrit|Ia936ecd68e675cff2925dba933e3b67b9bad4cd6}} (duration: 00m 11s)
* 13:26 jforrester@deploy2002: Started deploy [integration/docroot@f021d3f]: {{Gerrit|Ia936ecd68e675cff2925dba933e3b67b9bad4cd6}}
* 13:24 kamila@deploy2002: Finished scap sync-world: Backport for [[gerrit:1256384{{!}}Temporarily add shellbox-icu to $wgShellboxUrls (T419049 T419242 T419274)]] (duration: 07m 16s)
* 13:20 kamila@deploy2002: kamila: Continuing with sync
* 13:19 kamila@deploy2002: kamila: Backport for [[gerrit:1256384{{!}}Temporarily add shellbox-icu to $wgShellboxUrls (T419049 T419242 T419274)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:17 kamila@deploy2002: Started scap sync-world: Backport for [[gerrit:1256384{{!}}Temporarily add shellbox-icu to $wgShellboxUrls (T419049 T419242 T419274)]]
* 13:15 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1202: Pool db1202.eqiad.wmnet in after cloning
* 13:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 13:13 kamila@deploy2002: Finished scap sync-world: Backport for [[gerrit:1261420{{!}}cswiki: lift IP cap for editathon (T421305)]] (duration: 07m 22s)
* 13:12 btullis@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 13:09 kamila@deploy2002: kamila, anzx: Continuing with sync
* 13:08 jynus: deploying new grants for new ms-backup hosts and removing old ones [[phab:T420464|T420464]]
* 13:08 kamila@deploy2002: kamila, anzx: Backport for [[gerrit:1261420{{!}}cswiki: lift IP cap for editathon (T421305)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:06 kamila@deploy2002: Started scap sync-world: Backport for [[gerrit:1261420{{!}}cswiki: lift IP cap for editathon (T421305)]]
* 13:03 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:52 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:50 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:43 cdanis: puppet reenabled on drmrs, CIDERGRINDER deployed
* 12:39 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:39 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:23 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:12 cdanis: cdanis@cumin1003.eqiad.wmnet: sudo cumin 'A:cp-drmrs' 'disable-puppet "cdanis CIDER"'
* 12:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1004.eqiad.wmnet
* 12:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1006.eqiad.wmnet
* 12:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1003.eqiad.wmnet
* 12:02 elukey@cumin1003: END (PASS) - Cookbook sre.postgresql.postgres-init (exit_code=0)
* 12:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1002.eqiad.wmnet
* 12:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1005.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1006.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1005.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1004.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1003.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1002.eqiad.wmnet
* 11:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1001.eqiad.wmnet
* 11:44 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:43 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:43 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:42 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:42 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:42 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:41 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:41 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 11:41 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 11:41 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1001.eqiad.wmnet
* 11:38 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1002.eqiad.wmnet
* 11:37 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 11:33 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1002.eqiad.wmnet
* 11:32 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1202: Depool db1202.eqiad.wmnet to then clone it to db1253.eqiad.wmnet - fceratto@cumin1003
* 11:31 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
* 11:31 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1202: Depool db1202.eqiad.wmnet to then clone it to db1253.eqiad.wmnet - fceratto@cumin1003
* 11:31 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db1202.eqiad.wmnet onto db1253.eqiad.wmnet
* 11:31 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 11:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1001.eqiad.wmnet
* 11:22 elukey@cumin1003: START - Cookbook sre.postgresql.postgres-init
* 11:22 elukey@cumin1003: END (FAIL) - Cookbook sre.postgresql.postgres-init (exit_code=99)
* 11:22 elukey@cumin1003: START - Cookbook sre.postgresql.postgres-init
* 11:19 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1001.eqiad.wmnet
* 11:15 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync
* 11:14 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: sync
* 11:14 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 11:13 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 11:07 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 11:04 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260802{{!}}SI: Enable on bnwiki, itwiki, simplewiki, and plwiki (T415529)]] (duration: 09m 23s)
* 10:59 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 10:56 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1260802{{!}}SI: Enable on bnwiki, itwiki, simplewiki, and plwiki (T415529)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:54 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1260802{{!}}SI: Enable on bnwiki, itwiki, simplewiki, and plwiki (T415529)]]
* 10:33 oblivian@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: sync
* 10:32 oblivian@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifeeds: sync
* 10:32 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 10:32 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 10:31 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 10:31 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 10:31 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 10:31 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 10:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:23 tappof@cumin1003: END (PASS) - Cookbook sre.o11y.thanos-compact-restart (exit_code=0)
* 10:23 tappof@cumin1003: START - Cookbook sre.o11y.thanos-compact-restart
* 10:22 tappof@cumin1003: END (PASS) - Cookbook sre.o11y.thanos-compact-restart (exit_code=0)
* 10:22 tappof@cumin1003: START - Cookbook sre.o11y.thanos-compact-restart
* 10:12 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s1
* 10:11 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s1
* 10:05 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s4
* 10:05 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s4
* 09:58 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s8
* 09:58 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s8
* 09:53 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 09:53 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:52 hashar: Starting Gerrit on the replica / gerrit1003
* 09:51 hashar: Stopping Gerrit on the replica / gerrit1003 to clear web sessions
* 09:51 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s7
* 09:50 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s7
* 09:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 09:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 09:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 09:46 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 09:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 09:44 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s3
* 09:43 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s3
* 09:42 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 09:36 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s2
* 09:36 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s2
* 09:31 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 09:29 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s5
* 09:29 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:29 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s5
* 09:26 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 09:22 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s6
* 09:22 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 09:22 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s6
* 09:18 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:16 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section es6
* 09:15 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section es6
* 09:13 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 09:11 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 09:10 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 09:08 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section x3
* 09:07 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section x3
* 09:05 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 09:02 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section x1
* 09:01 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section x1
* 09:01 federico3: starting [[phab:T416708|T416708]] - disabling circular replication on core dbs
* 08:56 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 08:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 08:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 08:43 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:41 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:32 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 08:27 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 08:18 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 08:11 hashar@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.21 refs [[phab:T420479|T420479]]
* 05:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Downgrade clouddb1022 to 10.11.13
== 2026-03-25 ==
* 23:59 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on zuul2001.codfw.wmnet with reason: [[phab:T421330|T421330]]
* 23:58 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on zuul1001.eqiad.wmnet with reason: [[phab:T421330|T421330]]
* 23:29 mutante: zuul1001 - installed mariadb-client, connected once to the zuul db on m1-master and ran mysql> truncate "alembic_version"; then systemctl restart zuul-web. This fixed the zuul-web service; systemctl status finally shows no error. ([[phab:T405119|T405119]])
* 21:38 ryankemper: [opensearch-k8s] [[phab:T414484|T414484]] Depooled eqiad; change verified working (now when I do `host k8s-ingress-dse-aa.discovery.wmnet` from `cumin1003`, and then reverse-lookup the resulting IP, I get a codfw address); so traffic is now routing to dse-k8s-codfw
* 21:35 ryankemper@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 21:30 Dreamy_Jazz: Created cusi_case, cusi_user, and cusi_signal on bnwiki, itwiki, simplewiki, plwiki for [[phab:T415529|T415529]]
* 21:27 ryankemper: [opensearch-k8s] [[phab:T414484|T414484]] Getting ready to depool `dnsdisc=k8s-ingress-dse-aa,name=eqiad`, leaving codfw pooled. This will get us ready for a full rolling-upgrade of the dse-k8s-eqiad cluster tomorrow.
* 21:23 Dreamy_Jazz: Evening UTC backport window done
* 21:08 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260797{{!}}SuggestedInvestigations: Import session into signal matching job (T421062)]] (duration: 10m 26s)
* 21:04 kharlan@deploy2002: kharlan: Continuing with sync
* 21:01 kharlan@deploy2002: kharlan: Backport for [[gerrit:1260797{{!}}SuggestedInvestigations: Import session into signal matching job (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:58 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1260797{{!}}SuggestedInvestigations: Import session into signal matching job (T421062)]]
* 20:51 eevans@cumin1003: END (ERROR) - Cookbook sre.cassandra.roll-reboot (exit_code=97) rolling reboot on P<nowiki>{</nowiki>sessionstore[1004-1006].eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 20:43 aaron@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259183{{!}}Add Analytics APIs to the RestSandbox (T419429)]] (duration: 08m 33s)
* 20:38 aaron@deploy2002: aaron: Continuing with sync
* 20:36 aaron@deploy2002: aaron: Backport for [[gerrit:1259183{{!}}Add Analytics APIs to the RestSandbox (T419429)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:34 aaron@deploy2002: Started scap sync-world: Backport for [[gerrit:1259183{{!}}Add Analytics APIs to the RestSandbox (T419429)]]
* 20:30 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247073{{!}}Deploy temporary accounts to ruwiki (T413771)]] (duration: 11m 04s)
* 20:25 jdlrobson@deploy2002: stran, jdlrobson: Continuing with sync
* 20:21 jdlrobson@deploy2002: stran, jdlrobson: Backport for [[gerrit:1247073{{!}}Deploy temporary accounts to ruwiki (T413771)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:19 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1247073{{!}}Deploy temporary accounts to ruwiki (T413771)]]
* 20:17 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255775{{!}}Close the legacy-vector dblist (T421289)]] (duration: 07m 42s)
* 20:14 aokoth@cumin1003: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:14 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:13 aokoth@cumin1003: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:13 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:12 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 20:12 aokoth@cumin1003: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:12 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1255775{{!}}Close the legacy-vector dblist (T421289)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1255775{{!}}Close the legacy-vector dblist (T421289)]]
* 20:05 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy (exit_code=0) rolling reboot on P<nowiki>{</nowiki>hcaptcha-proxy7002.wikimedia.org<nowiki>}</nowiki> and A:hcaptcha-proxy
* 20:01 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy rolling reboot on P<nowiki>{</nowiki>hcaptcha-proxy7002.wikimedia.org<nowiki>}</nowiki> and A:hcaptcha-proxy
* 20:00 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>sessionstore[1004-1006].eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 19:34 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pybal-test2003.codfw.wmnet
* 19:30 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host pybal-test2003.codfw.wmnet
* 19:26 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 19:24 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>sessionstore[2004-2006].codfw.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 19:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2011.codfw.wmnet
* 19:17 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2011.codfw.wmnet
* 19:17 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 19:14 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Planned reboot
* 19:11 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 19:11 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 19:07 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 19:00 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2012.codfw.wmnet
* 18:57 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2012.codfw.wmnet
* 18:53 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>sessionstore[2004-2006].codfw.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 18:51 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
* 18:51 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
* 18:50 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
* 18:50 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
* 18:49 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
* 18:49 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
* 18:49 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
* 18:48 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/toolhub: apply
* 18:48 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/termbox: apply
* 18:47 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/termbox: apply
* 18:47 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 18:47 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 18:47 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 18:46 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 18:46 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:46 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:46 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 18:46 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2012.codfw.wmnet with reason: Planned reboot
* 18:46 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 18:46 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 18:45 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 18:45 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 18:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 18:44 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
* 18:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/sessionstore: apply
* 18:44 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/recommendation-api: apply
* 18:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/recommendation-api: apply
* 18:43 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
* 18:43 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
* 18:43 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/push-notifications: apply
* 18:42 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/push-notifications: apply
* 18:42 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 18:41 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/proton: apply
* 18:41 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 18:40 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
* 18:40 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/page-analytics: apply
* 18:39 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 18:39 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
* 18:37 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 18:37 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 18:37 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 18:37 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 18:35 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 18:34 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:34 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:33 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
* 18:29 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 18:28 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
* 18:28 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* 18:26 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 18:26 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
* 18:26 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
* 18:25 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: debug java install
* 18:25 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 18:25 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases1003.eqiad.wmnet with reason: debug java install
* 18:25 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/image-suggestion: apply
* 18:24 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/image-suggestion: apply
* 18:23 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
* 18:23 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
* 18:23 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
* 18:22 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
* 18:22 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 18:22 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 18:21 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
* 18:21 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
* 18:20 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
* 18:20 mutante: releases1003 - apt-get upgrade - envoyproxy, python3-wmflib
* 18:20 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
* 18:20 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:19 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 18:19 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 18:18 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 18:18 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
* 18:18 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
* 18:17 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
* 18:17 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
* 18:16 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 18:16 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/echostore: apply
* 18:15 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
* 18:15 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
* 18:15 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 18:14 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 18:14 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/data-gateway: apply
* 18:14 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/data-gateway: apply
* 18:14 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 18:13 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 18:13 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/commons-impact-analytics: apply
* 18:13 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/commons-impact-analytics: apply
* 18:12 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 18:12 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
* 18:11 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 18:11 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
* 18:11 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/chart-renderer: apply
* 18:11 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/chart-renderer: apply
* 18:09 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/apertium: apply
* 18:09 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/apertium: apply
* 17:29 sfaci@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 17:29 sfaci@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 17:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 17:23 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases1003.eqiad.wmnet with reason: debug java install
* 17:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 16:44 ebysans@deploy2002: Finished deploy [analytics/refinery@80c527b] (thin): Regular analytics weekly train THIN [analytics/refinery@80c527b6] (duration: 01m 59s)
* 16:42 ebysans@deploy2002: Started deploy [analytics/refinery@80c527b] (thin): Regular analytics weekly train THIN [analytics/refinery@80c527b6]
* 16:42 SandraEbele_: Deploying Refinery as part of weekly deployment train
* 16:41 ebysans@deploy2002: Finished deploy [analytics/refinery@80c527b]: Regular analytics weekly train [analytics/refinery@80c527b6] (duration: 04m 32s)
* 16:37 ebysans@deploy2002: Started deploy [analytics/refinery@80c527b]: Regular analytics weekly train [analytics/refinery@80c527b6]
* 16:22 ebysans@deploy2002: Finished deploy [analytics/refinery@80c527b] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@80c527b6] (duration: 01m 58s)
* 16:22 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 16:21 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:21 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:20 ebysans@deploy2002: Started deploy [analytics/refinery@80c527b] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@80c527b6]
* 16:20 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 16:19 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 16:18 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:18 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:06 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 16:05 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 16:05 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 16:04 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 16:03 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:02 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 16:02 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 16:01 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:51 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:50 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:42 blake@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244628{{!}}debug: reorder debug backends for eqiad switchover (T413974)]] (duration: 07m 41s)
* 15:41 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:37 blake@deploy2002: blake: Continuing with sync
* 15:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:37 blake@deploy2002: blake: Backport for [[gerrit:1244628{{!}}debug: reorder debug backends for eqiad switchover (T413974)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:34 blake@deploy2002: Started scap sync-world: Backport for [[gerrit:1244628{{!}}debug: reorder debug backends for eqiad switchover (T413974)]]
* 15:34 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:32 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-unlock-scap (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:32 root@deploy2002: Unlocked for deployment [ALL REPOSITORIES]: Datacenter switchover from codfw to eqiad - (duration: 91m 45s)
* 15:32 root@deploy2002: Forcefully removing global lock: Datacenter switchover from codfw to eqiad -
* 15:32 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-unlock-scap for datacenter switchover from codfw to eqiad
* 15:31 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:26 blake@dns1004: END - running authdns-update
* 15:24 blake@dns1004: START - running authdns-update
* 15:24 elukey@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 15:23 elukey@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 15:18 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters for datacenter switchover from codfw to eqiad
* 15:18 blake@dns1004: END - running authdns-update
* 15:16 blake@dns1004: START - running authdns-update
* 15:14 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-restore-ttl (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:13 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-restore-ttl for datacenter switchover from codfw to eqiad
* 15:11 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:10 root@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 15:09 root@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 15:08 root@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 15:07 root@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 15:07 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.08-start-maintenance for datacenter switchover from codfw to eqiad
* 15:07 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:07 root@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: sync
* 15:07 root@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: sync
* 15:07 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner for datacenter switchover from codfw to eqiad
* 15:02 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:02 blake@cumin1003: MediaWiki read-only period ends at: 2026-03-25 15:02:52.921926
* 14:55 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0) for datacenter switchover from codfw to eqiad
* 14:53 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance for datacenter switchover from codfw to eqiad
* 14:52 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks (exit_code=0) for datacenter switchover from codfw to eqiad
* 14:52 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks for datacenter switchover from codfw to eqiad
* 14:51 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0) for datacenter switchover from codfw to eqiad
* 14:46 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl for datacenter switchover from codfw to eqiad
* 14:28 cdanis: 💙cdanis@apt1002.wikimedia.org ~ 🕥☕ sudo -i reprepro --component main --restrict cidergrinder update trixie-wikimedia
* 14:28 cdanis: 💙cdanis@apt1002.wikimedia.org ~ 🕥☕ sudo -i reprepro --component main --restrict cidergrinder update bullseye-wikimedia
* 14:24 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['phab2002']
* 14:24 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:17 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['phab2002']
* 14:14 gengh@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:13 gengh@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:13 gengh@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:12 gengh@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 gengh@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:11 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:11 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:11 gengh@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:10 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:10 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:08 gengh@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:07 gengh@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 gengh@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:07 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:06 gengh@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:06 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:06 gengh@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:05 gengh@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-lock-scap (exit_code=0) for datacenter switchover from codfw to eqiad
* 14:00 root@deploy2002: Locking from deployment [ALL REPOSITORIES]: Datacenter switchover from codfw to eqiad -
* 14:00 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-lock-scap for datacenter switchover from codfw to eqiad
* 13:49 otto@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260091{{!}}EventStreamConfig - Increase spark_job_ingestion_scale for larger event streams (T360794 T351225)]] (duration: 07m 48s)
* 13:45 otto@deploy2002: otto: Continuing with sync
* 13:45 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:44 otto@deploy2002: otto: Backport for [[gerrit:1260091{{!}}EventStreamConfig - Increase spark_job_ingestion_scale for larger event streams (T360794 T351225)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:42 otto@deploy2002: Started scap sync-world: Backport for [[gerrit:1260091{{!}}EventStreamConfig - Increase spark_job_ingestion_scale for larger event streams (T360794 T351225)]]
* 13:32 awight@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260614{{!}}[beta] Kill synthetic refs with feature flag (T421055)]], [[gerrit:1251193{{!}}idwiki: Remove unused user groups on Indonesian Wikipedia (T419105)]], [[gerrit:1251200{{!}}ptwiki: Enable block action for the abuse filter (T419312)]], [[gerrit:1256748{{!}}ptwiki: Add suppressredirect to autoreviewer and rollbacker user groups (T420704)]] (duration: 11m 33s)
* 13:29 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:29 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:27 awight@deploy2002: codenamenoreste, awight, gerrit-patch-uploader: Continuing with sync
* 13:23 awight@deploy2002: codenamenoreste, awight, gerrit-patch-uploader: Backport for [[gerrit:1260614{{!}}[beta] Kill synthetic refs with feature flag (T421055)]], [[gerrit:1251193{{!}}idwiki: Remove unused user groups on Indonesian Wikipedia (T419105)]], [[gerrit:1251200{{!}}ptwiki: Enable block action for the abuse filter (T419312)]], [[gerrit:1256748{{!}}ptwiki: Add suppressredirect to autoreviewer and rollbacker user groups (T420704)]]
* 13:20 awight@deploy2002: Started scap sync-world: Backport for [[gerrit:1260614{{!}}[beta] Kill synthetic refs with feature flag (T421055)]], [[gerrit:1251193{{!}}idwiki: Remove unused user groups on Indonesian Wikipedia (T419105)]], [[gerrit:1251200{{!}}ptwiki: Enable block action for the abuse filter (T419312)]], [[gerrit:1256748{{!}}ptwiki: Add suppressredirect to autoreviewer and rollbacker user groups (T420704)]]
* 13:17 dcausse@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260045{{!}}Revert^2 "search: use the discovery ns record for the semanticsearch cluster"]] (duration: 10m 20s)
* 13:12 dcausse@deploy2002: dcausse: Continuing with sync
* 13:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:09 dcausse@deploy2002: dcausse: Backport for [[gerrit:1260045{{!}}Revert^2 "search: use the discovery ns record for the semanticsearch cluster"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:07 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:06 dcausse@deploy2002: Started scap sync-world: Backport for [[gerrit:1260045{{!}}Revert^2 "search: use the discovery ns record for the semanticsearch cluster"]]
* 13:06 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:02 XioNoX: Inter.Link - DDoS - Activation of automatic reroute
* 12:56 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:51 marostegui@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 12:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Downgrade clouddb1022 to 10.11.15
* 12:41 marostegui@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 12:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-mariadb1002.eqiad.wmnet
* 12:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-test-coord1002.eqiad.wmnet
* 12:38 mszwarc@deploy2002: mwscript-k8s job started: foreachwikiindblist all demoteIneligibleUsers.php --relay-log checkuser=metawiki --relay-log suppress=metawiki # [[phab:T418580|T418580]]
* 12:34 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-test-coord1002.eqiad.wmnet
* 12:34 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-mariadb1002.eqiad.wmnet
* 12:33 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:32 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1028.eqiad.wmnet
* 12:25 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host wdqs1028.eqiad.wmnet
* 12:24 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:19 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260617{{!}}Allow for demoting 2FA-less members of further 6 groups (T418580)]] (duration: 10m 23s)
* 12:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2009.codfw.wmnet
* 12:12 mszwarc@deploy2002: mszwarc: Continuing with sync
* 12:11 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1260617{{!}}Allow for demoting 2FA-less members of further 6 groups (T418580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:09 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1260617{{!}}Allow for demoting 2FA-less members of further 6 groups (T418580)]]
* 12:07 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host wdqs2009.codfw.wmnet
* 12:07 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl2002.codfw.wmnet
* 12:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl2002.codfw.wmnet
* 11:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl2001.codfw.wmnet
* 11:53 marostegui: Restart clouddb1022:s3 to enable error_log [[phab:T420177|T420177]]
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl2001.codfw.wmnet
* 11:51 jayme: migrated wikikube apiservers (eqiad and codfw) to IPIP - [[phab:T420436|T420436]]
* 11:49 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: wikikube-master-codfw@codfw
* 11:49 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 11:48 jayme@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 11:46 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: wikikube-master-eqiad@eqiad
* 11:46 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 11:45 jayme@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 11:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1002.eqiad.wmnet
* 11:43 jayme@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: wikikube-master-codfw@codfw
* 11:41 jayme@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: wikikube-master-eqiad@eqiad
* 11:40 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1002.eqiad.wmnet
* 11:38 mvernon@deploy2002: helmfile [codfw] DONE helmfile.d/services/kartotherian: apply
* 11:36 mvernon@deploy2002: helmfile [codfw] START helmfile.d/services/kartotherian: apply
* 11:21 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:18 mvernon@deploy2002: helmfile [codfw] DONE helmfile.d/services/kartotherian: apply
* 11:16 mvernon@deploy2002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
* 11:15 mvernon@deploy2002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
* 11:15 mvernon@deploy2002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
* 11:14 mvernon@deploy2002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
* 11:07 mvernon@deploy2002: helmfile [codfw] START helmfile.d/services/kartotherian: apply
* 11:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis abstractwiki in section s5
* 11:07 mvernon@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: apply
* 11:05 mvernon@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: apply
* 10:55 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:53 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis abstractwiki in section s5
* 10:45 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 10:33 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:27 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:26 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:21 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:20 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:20 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:01 elukey@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=kartotherian,name=codfw
* 09:58 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:57 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:52 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:52 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:51 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:51 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:46 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:45 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:45 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:44 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:05 elukey@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=aux-k8s-worker200[2-5].codfw.wmnet,cluster=aux-k8s,service=kubesvc
* 09:04 elukey@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=aux-k8s-worker200[6-9].codfw.wmnet,cluster=aux-k8s,service=kubesvc
* 09:04 elukey@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=aux-k8s-worker100[6-9].eqiad.wmnet,cluster=aux-k8s,service=kubesvc
* 08:55 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker200[6-9].eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:55 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker200[6-9].eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:35 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker1009.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:35 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker1008.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker1007.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker1006.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker1009.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker1008.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker1007.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker1006.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device fasw2-c8b-codfw
* 08:29 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device fasw2-c8b-codfw
* 08:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device fasw2-c8a-codfw
* 08:29 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device fasw2-c8a-codfw
* 08:10 hashar@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.21 refs [[phab:T420479|T420479]]
* 00:33 rzl@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 00:23 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
* 00:23 rzl@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
* 00:23 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 00:22 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 00:22 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
* 00:22 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
* 00:21 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
* 00:21 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
* 00:21 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 00:20 rzl@deploy2002: helmfile [staging] START helmfile.d/services/toolhub: apply
* 00:20 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/termbox: apply
* 00:20 rzl@deploy2002: helmfile [staging] START helmfile.d/services/termbox: apply
* 00:20 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
* 00:20 rzl@deploy2002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
* 00:19 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 00:19 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 00:18 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 00:18 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 00:18 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 00:18 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 00:17 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 00:17 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 00:17 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 00:17 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 00:17 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 00:16 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
* 00:16 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 00:16 rzl@deploy2002: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 00:16 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/recommendation-api: apply
* 00:15 rzl@deploy2002: helmfile [staging] START helmfile.d/services/recommendation-api: apply
* 00:15 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
* 00:15 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
* 00:15 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1023.eqiad.wmnet with OS bookworm
* 00:15 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/push-notifications: apply
* 00:15 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:15 rzl@deploy2002: helmfile [staging] START helmfile.d/services/push-notifications: apply
* 00:15 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/proton: apply
* 00:14 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:14 rzl@deploy2002: helmfile [staging] START helmfile.d/services/proton: apply
* 00:13 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
* 00:13 rzl@deploy2002: helmfile [staging] START helmfile.d/services/page-analytics: apply
* 00:13 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 00:13 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
* 00:13 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 00:12 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 00:12 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 00:11 rzl@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
* 00:10 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 00:10 rzl@deploy2002: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 00:09 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
* 00:07 rzl@deploy2002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
* 00:07 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 00:06 rzl@deploy2002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* 00:06 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
* 00:06 rzl@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: apply
* 00:06 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
* 00:06 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1022.eqiad.wmnet with OS bookworm
* 00:06 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:06 rzl@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
* 00:06 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/image-suggestion: apply
* 00:06 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:05 rzl@deploy2002: helmfile [staging] START helmfile.d/services/image-suggestion: apply
* 00:05 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
* 00:05 rzl@deploy2002: helmfile [staging] START helmfile.d/services/geo-analytics: apply
* 00:05 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 00:05 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 00:04 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 00:04 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 00:04 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
* 00:04 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1021.eqiad.wmnet with OS bookworm
* 00:04 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:04 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
* 00:04 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 00:04 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:03 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 00:03 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 00:03 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 00:02 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
* 00:02 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
* 00:02 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
* 00:01 rzl@deploy2002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
* 00:01 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
* 00:01 rzl@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
* 00:00 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 00:00 rzl@deploy2002: helmfile [staging] START helmfile.d/services/echostore: apply
* 00:00 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
* 00:00 rzl@deploy2002: helmfile [staging] START helmfile.d/services/device-analytics: apply
== 2026-03-24 ==
* 23:59 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 23:59 rzl@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 23:59 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
* 23:59 rzl@deploy2002: helmfile [staging] START helmfile.d/services/data-gateway: apply
* 23:58 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 23:58 rzl@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 23:58 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/commons-impact-analytics: apply
* 23:58 rzl@deploy2002: helmfile [staging] START helmfile.d/services/commons-impact-analytics: apply
* 23:56 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 23:56 rzl@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
* 23:56 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 23:56 rzl@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
* 23:54 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/chart-renderer: apply
* 23:53 rzl@deploy2002: helmfile [staging] START helmfile.d/services/chart-renderer: apply
* 23:53 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1023.eqiad.wmnet with reason: host reimage
* 23:53 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/apertium: apply
* 23:52 rzl@deploy2002: helmfile [staging] START helmfile.d/services/apertium: apply
* 23:46 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1022.eqiad.wmnet with reason: host reimage
* 23:43 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1021.eqiad.wmnet with reason: host reimage
* 23:19 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1023.eqiad.wmnet with OS bookworm
* 23:19 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1022.eqiad.wmnet with OS bookworm
* 23:18 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1021.eqiad.wmnet with OS bookworm
* 23:16 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
* 23:16 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
* 23:15 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
* 23:15 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
* 22:03 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260118{{!}}Drop inactive simple summary surveys (T389393)]] (duration: 08m 15s)
* 21:57 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 21:57 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1260118{{!}}Drop inactive simple summary surveys (T389393)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1260118{{!}}Drop inactive simple summary surveys (T389393)]]
* 21:52 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259147{{!}}Address FIXME and drop not selector for section headings (T420085)]] (duration: 13m 11s)
* 21:47 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 21:44 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1259147{{!}}Address FIXME and drop not selector for section headings (T420085)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:38 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1259147{{!}}Address FIXME and drop not selector for section headings (T420085)]]
* 21:00 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=frwiki --source-pseudo-namespace=Abstract_ --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:55 jforrester@deploy2002: mwscript-k8s job started: moveBatch --wiki=frwiki '--u=Jdforrester (WMF)' --r=[[phab:T420654|T420654]] --noredirects /home/jforrester/T420654-frwiki-move # [[phab:T420654|T420654]] abstract: is now an interwiki; manual fix
* 20:55 jforrester@deploy2002: mwscript-k8s job started: moveBatch '--u=Jdforrester (WMF)' --r=[[phab:T420654|T420654]] --noredirects /home/jforrester/T420654-frwiki-move # [[phab:T420654|T420654]] abstract: is now an interwiki; manual fix
* 20:47 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=ptwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:46 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=idwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:46 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=frwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:45 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=eswiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:39 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=enwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:39 jforrester@deploy2002: mwscript-k8s job started: sql extensions/WikimediaMaintenance/maintenance/namespaceDupes.php --wiki=enwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:38 jforrester@deploy2002: mwscript-k8s job started: sql maintenance/namespaceDupes.php --wiki=enwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:38 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1256433{{!}}[wikifunctions] Drop m.wikifunctions.org from lists, we've not used it for years]], [[gerrit:1250114{{!}}Move GrowthExperiments REST API definition to IS]], [[gerrit:1259993{{!}}dumpInterwiki: Re-generate to add Abstract Wikipedia (and others) (T420654)]] (duration: 07m 46s)
* 20:33 jforrester@deploy2002: jforrester: Continuing with sync
* 20:32 jforrester@deploy2002: jforrester: Backport for [[gerrit:1256433{{!}}[wikifunctions] Drop m.wikifunctions.org from lists, we've not used it for years]], [[gerrit:1250114{{!}}Move GrowthExperiments REST API definition to IS]], [[gerrit:1259993{{!}}dumpInterwiki: Re-generate to add Abstract Wikipedia (and others) (T420654)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified the
* 20:30 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1256433{{!}}[wikifunctions] Drop m.wikifunctions.org from lists, we've not used it for years]], [[gerrit:1250114{{!}}Move GrowthExperiments REST API definition to IS]], [[gerrit:1259993{{!}}dumpInterwiki: Re-generate to add Abstract Wikipedia (and others) (T420654)]]
* 20:27 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259967{{!}}Set json object before setting Abstract Wiki Id (T420916)]], [[gerrit:1259994{{!}}AbstractPreview: apply selected preview language lang/dir to abstract preview body (T420687)]], [[gerrit:1260092{{!}}AbstractTitle: Handle pageinfo responses without normalized titles (T420725)]], [[gerrit:1259992{{!}}[abstractwiki] Don't list abstract as a langlist entry
* 20:22 jforrester@deploy2002: jforrester: Continuing with sync
* 20:22 jforrester@deploy2002: jforrester: Backport for [[gerrit:1259967{{!}}Set json object before setting Abstract Wiki Id (T420916)]], [[gerrit:1259994{{!}}AbstractPreview: apply selected preview language lang/dir to abstract preview body (T420687)]], [[gerrit:1260092{{!}}AbstractTitle: Handle pageinfo responses without normalized titles (T420725)]], [[gerrit:1259992{{!}}[abstractwiki] Don't list abstract as a langlist entry (T420654)]] s
* 20:20 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1259967{{!}}Set json object before setting Abstract Wiki Id (T420916)]], [[gerrit:1259994{{!}}AbstractPreview: apply selected preview language lang/dir to abstract preview body (T420687)]], [[gerrit:1260092{{!}}AbstractTitle: Handle pageinfo responses without normalized titles (T420725)]], [[gerrit:1259992{{!}}[abstractwiki] Don't list abstract as a langlist entry
* 20:12 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1242542{{!}}Generate our own logo thumbnails rather than using MediaWiki's (T414048)]], [[gerrit:1250095{{!}}Enwikinews: Only enable flaggedRevs in article namespace (T418066)]], [[gerrit:1252684{{!}}Disable magic links on afwiki (T420142)]] (duration: 09m 22s)
* 20:08 jforrester@deploy2002: jforrester, pppery: Continuing with sync
* 20:05 jforrester@deploy2002: jforrester, pppery: Backport for [[gerrit:1242542{{!}}Generate our own logo thumbnails rather than using MediaWiki's (T414048)]], [[gerrit:1250095{{!}}Enwikinews: Only enable flaggedRevs in article namespace (T418066)]], [[gerrit:1252684{{!}}Disable magic links on afwiki (T420142)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:03 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1242542{{!}}Generate our own logo thumbnails rather than using MediaWiki's (T414048)]], [[gerrit:1250095{{!}}Enwikinews: Only enable flaggedRevs in article namespace (T418066)]], [[gerrit:1252684{{!}}Disable magic links on afwiki (T420142)]]
* 19:42 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 19:42 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 19:41 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 19:39 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1258300{{!}}tests: Make many things static for PHPUnit 10 (T420844)]], [[gerrit:1258301{{!}}phpunit.xml: Update configuration for PHPUnit 10 (T420844)]] (duration: 07m 21s)
* 19:35 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 19:35 reedy@deploy2002: reedy: Continuing with sync
* 19:34 reedy@deploy2002: reedy: Backport for [[gerrit:1258300{{!}}tests: Make many things static for PHPUnit 10 (T420844)]], [[gerrit:1258301{{!}}phpunit.xml: Update configuration for PHPUnit 10 (T420844)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:32 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1258300{{!}}tests: Make many things static for PHPUnit 10 (T420844)]], [[gerrit:1258301{{!}}phpunit.xml: Update configuration for PHPUnit 10 (T420844)]]
* 19:02 inflatador: bking@apt1002 `sudo -E reprepro -C component/opensearch2 include trixie-wikimedia ~/wmf-opensearch-search-plugins-2.19.5+3-trixie/wmf-opensearch-search-plugins_2.19.5+3_amd64.changes`
* 18:48 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 18:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1170: Degraded drive replaced [[phab:T420873|T420873]]
* 18:43 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 18:36 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 18:35 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 18:25 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:24 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:20 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 18:20 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 18:13 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 18:11 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 18:07 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on phab2002.codfw.wmnet with reason: [[phab:T420228|T420228]]
* 18:01 aokoth@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 18:01 aokoth@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 18:01 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:00 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:00 mutante: codesearch9.codesearch - systemctl restart hound_proxy ([[phab:T421147|T421147]])
* 17:34 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 17:30 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 17:20 blake@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 17:20 blake@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 17:20 blake@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 17:20 blake@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 17:00 blake@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:00 blake@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:00 blake@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:00 blake@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 16:47 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 16:38 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1113.*
* 16:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1170: Degraded drive replaced [[phab:T420873|T420873]]
* 16:24 sfaci@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:24 sfaci@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:22 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1113.eqiad.wmnet with OS trixie
* 16:05 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 16:04 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 16:03 blake@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 16:03 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 16:03 blake@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 16:03 blake@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 16:03 blake@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 16:03 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1113.eqiad.wmnet with reason: host reimage
* 15:54 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1113.eqiad.wmnet with reason: host reimage
* 15:54 bjensen: Services portion of the datacenter switchover is complete
* 15:51 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker2009.codfw.wmnet with OS trixie
* 15:46 blake@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) depool all services in codfw: Datacenter Switchover - [[phab:T413974|T413974]]
* 15:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 15:38 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1113.eqiad.wmnet with OS trixie
* 15:38 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1113.eqiad.wmnet with OS trixie
* 15:36 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 15:34 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker2009.codfw.wmnet with reason: host reimage
* 15:30 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker2009.codfw.wmnet with reason: host reimage
* 15:20 blake@cumin1003: START - Cookbook sre.discovery.datacenter depool all services in codfw: Datacenter Switchover - [[phab:T413974|T413974]]
* 15:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1113.eqiad.wmnet with OS trixie
* 15:18 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker2009.codfw.wmnet with OS trixie
* 15:16 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker2008.codfw.wmnet with OS trixie
* 14:59 blake@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool codfw [reason: no reason specified, no task ID specified]
* 14:59 blake@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool codfw [reason: no reason specified, no task ID specified]
* 14:59 bjensen: Beginning the Traffic and Services portions of the DC switchover; operational follow-up will be in #wikimedia-sre
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker2008.codfw.wmnet with reason: host reimage
* 14:56 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker2008.codfw.wmnet with reason: host reimage
* 14:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:50 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1009.eqiad.wmnet with OS trixie
* 14:44 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker2008.codfw.wmnet with OS trixie
* 14:42 aokoth@dns1004: END - running authdns-update
* 14:41 aokoth@dns1004: START - running authdns-update
* 14:34 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1009.eqiad.wmnet with reason: host reimage
* 14:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2009.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:27 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1009.eqiad.wmnet with reason: host reimage
* 14:26 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker2009.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:25 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:23 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:20 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2008.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:19 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:19 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:16 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1009.eqiad.wmnet with OS trixie
* 14:15 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker2008.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:14 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
* 14:13 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 14:13 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 14:13 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 14:12 dcausse@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259979{{!}}Revert "search: use the discovery ns record for the semanticsearch cluster"]] (duration: 06m 54s)
* 14:08 dcausse@deploy2002: dcausse: Continuing with sync
* 14:07 dcausse@deploy2002: dcausse: Backport for [[gerrit:1259979{{!}}Revert "search: use the discovery ns record for the semanticsearch cluster"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2007.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:05 dcausse@deploy2002: Started scap sync-world: Backport for [[gerrit:1259979{{!}}Revert "search: use the discovery ns record for the semanticsearch cluster"]]
* 14:04 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1008.eqiad.wmnet with OS trixie
* 14:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker2007.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:59 jforrester@deploy2002: mwscript-k8s job started: sql --wiki=abstractwiki /srv/mediawiki/php-1.46.0-wmf.20/extensions/Translate/sql/mysql/translate_message_group_subscriptions.sql # [[phab:T420656|T420656]] translate_message_group_subscriptions
* 13:59 dcausse@deploy2002: Sync cancelled.
* 13:57 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:52 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1008.eqiad.wmnet with reason: host reimage
* 13:46 dcausse@deploy2002: dcausse: Backport for [[gerrit:1259875{{!}}search: use the discovery ns record for the semanticsearch cluster (T414484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:44 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1008.eqiad.wmnet with reason: host reimage
* 13:44 dcausse@deploy2002: Started scap sync-world: Backport for [[gerrit:1259875{{!}}search: use the discovery ns record for the semanticsearch cluster (T414484)]]
* 13:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1008.eqiad.wmnet with OS trixie
* 13:32 sukhe: sudo cumin -b1 -s20 'C:bird' "run-puppet-agent --enable 'merging CR {{Gerrit|1248385}}, [[phab:T413740|T413740]]'"
* 13:30 cmelo@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259231{{!}}Enable the CampaignEvents extension on all wikibooks (T419597)]], [[gerrit:1259237{{!}}Enable $wgCampaignEventsEnableEventGoals in prod wikis (T414149)]] (duration: 12m 43s)
* 13:26 cmelo@deploy2002: cmelo, daimona: Continuing with sync
* 13:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1007.eqiad.wmnet with OS trixie
* 13:23 sukhe: sudo cumin 'C:bird' "disable-puppet 'merging CR {{Gerrit|1248385}}, [[phab:T413740|T413740]]'"
* 13:20 cmelo@deploy2002: cmelo, daimona: Backport for [[gerrit:1259231{{!}}Enable the CampaignEvents extension on all wikibooks (T419597)]], [[gerrit:1259237{{!}}Enable $wgCampaignEventsEnableEventGoals in prod wikis (T414149)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:18 cmelo@deploy2002: Started scap sync-world: Backport for [[gerrit:1259231{{!}}Enable the CampaignEvents extension on all wikibooks (T419597)]], [[gerrit:1259237{{!}}Enable $wgCampaignEventsEnableEventGoals in prod wikis (T414149)]]
* 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1007.eqiad.wmnet with reason: host reimage
* 13:04 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) payments1012.frack.eqiad.wmnet on all recursors
* 13:04 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache payments1012.frack.eqiad.wmnet on all recursors
* 13:04 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) payments1011.frack.eqiad.wmnet on all recursors
* 13:03 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache payments1011.frack.eqiad.wmnet on all recursors
* 13:03 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) payments1010.frack.eqiad.wmnet on all recursors
* 13:03 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache payments1010.frack.eqiad.wmnet on all recursors
* 13:02 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1007.eqiad.wmnet with reason: host reimage
* 13:00 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:00 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify records for payments servers frack - cmooney@cumin1003"
* 13:00 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify records for payments servers frack - cmooney@cumin1003"
* 12:56 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 12:50 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1007.eqiad.wmnet with OS trixie
* 12:02 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1017.eqiad.wmnet
* 12:02 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1017.eqiad.wmnet
* 12:01 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s1
* 11:53 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:53 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:51 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1017.eqiad.wmnet with reason: Rebooting clouddb1017 [[phab:T419960|T419960]]
* 11:51 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1017.eqiad.wmnet
* 11:51 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1017.eqiad.wmnet
* 11:51 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s1
* 11:49 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:49 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:48 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:48 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:46 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:46 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:40 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1006.eqiad.wmnet with OS trixie
* 11:36 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin1001.eqiad.wmnet
* 11:32 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet,service=x3
* 11:32 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1016.eqiad.wmnet,service=x3
* 11:32 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcumin1001.eqiad.wmnet
* 11:31 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1023.eqiad.wmnet,service=x3
* 11:31 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=x3
* 11:31 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1023.eqiad.wmnet,service=x3
* 11:31 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1022.eqiad.wmnet,service=x3
* 11:27 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s3
* 11:27 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1023.eqiad.wmnet,service=s3
* 11:27 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1023.eqiad.wmnet,service=s3
* 11:26 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1006.eqiad.wmnet with reason: host reimage
* 11:22 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:19 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:19 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1023.eqiad.wmnet,service=s3
* 11:18 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1006.eqiad.wmnet with reason: host reimage
* 11:18 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s3
* 11:17 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s3
* 11:14 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1023.eqiad.wmnet,service=s3
* 11:14 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:07 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1006.eqiad.wmnet with OS trixie
* 10:55 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:55 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:53 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker2007.codfw.wmnet with OS trixie
* 10:49 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker2006.codfw.wmnet with OS trixie
* 10:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:36 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker2007.codfw.wmnet with reason: host reimage
* 10:33 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker2006.codfw.wmnet with reason: host reimage
* 10:30 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:29 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:29 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker2007.codfw.wmnet with reason: host reimage
* 10:28 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker2006.codfw.wmnet with reason: host reimage
* 10:22 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:21 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:20 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:20 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:17 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host aux-k8s-worker1006.eqiad.wmnet with OS trixie
* 10:17 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker2007.codfw.wmnet with OS trixie
* 10:16 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker2006.codfw.wmnet with OS trixie
* 10:07 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1006.eqiad.wmnet with OS trixie
* 09:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:46 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:43 ayounsi@cumin1003: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:34 ayounsi@cumin1003: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:31 ayounsi@cumin1003: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:31 ayounsi@cumin1003: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:29 ayounsi@cumin1003: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:29 ayounsi@cumin1003: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:23 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4008.ulsfo.wmnet with OS bookworm
* 09:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 09:01 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 08:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 08:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old ulsfo ganeti VIP - ayounsi@cumin1003"
* 08:50 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old ulsfo ganeti VIP - ayounsi@cumin1003"
* 08:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:46 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1170: Degraded drive [[phab:T420873|T420873]]
* 08:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:45 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1170: Degraded drive [[phab:T420873|T420873]]
* 08:43 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:39 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host ganeti4008.ulsfo.wmnet with OS bookworm
* 08:31 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 08:31 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:29 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:27 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:27 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:25 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:24 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:13 hashar@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.21 refs [[phab:T420479|T420479]]
* 07:59 hashar: Changed https://logstash.wikimedia.org/ default page back to /app/dashboards
* 04:01 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.18 (duration: 01m 13s)
* 03:42 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.21 refs [[phab:T420479|T420479]] (duration: 39m 27s)
* 03:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.21 refs [[phab:T420479|T420479]]
* 02:46 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 04s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:56 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 01:40 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1104.*
* 01:37 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1104.eqiad.wmnet with OS trixie
* 01:14 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1104.eqiad.wmnet with reason: host reimage
* 01:08 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1104.eqiad.wmnet with reason: host reimage
* 00:52 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1104.eqiad.wmnet with OS trixie
* 00:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 00:18 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1115.eqiad.wmnet with OS trixie
== 2026-03-23 ==
* 22:51 rzl: root@apt1002:~# reprepro --noskipold --restrict vopsbot update bookworm-wikimedia
* 22:44 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 22:28 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host an-worker1172.eqiad.wmnet
* 22:25 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1104.eqiad.wmnet with OS trixie
* 22:07 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 22:05 maryum: Deployed security fix for [[phab:T415584|T415584]]
* 21:53 maryum: Deployed security fix for [[phab:T419192|T419192]]
* 21:41 maryum: Deployed security fix for [[phab:T419168|T419168]]
* 21:35 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 21:25 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255847{{!}}testwiki: Add temporary groups for security testing]] (duration: 12m 33s)
* 21:22 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 21:21 catrope@deploy2002: catrope: Continuing with sync
* 21:18 catrope@deploy2002: catrope: Backport for [[gerrit:1255847{{!}}testwiki: Add temporary groups for security testing]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:12 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1255847{{!}}testwiki: Add temporary groups for security testing]]
* 21:05 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1106.eqiad.wmnet [reason: trixie reimaging]
* 21:05 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1106.eqiad.wmnet [reason: trixie reimaging]
* 21:05 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1104.eqiad.wmnet with OS trixie
* 21:04 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1104.eqiad.wmnet [reason: trixie reimaging]
* 21:03 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1103.eqiad.wmnet [reason: trixie reimaging]
* 20:58 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1256394{{!}}Abstract Wikipedia: Fix API call to get page info (T420725)]], [[gerrit:1259085{{!}}[abstractwiki] Enable the Translate extension (T420656)]], [[gerrit:1250113{{!}}Move testwiki-only Attribution REST API definition to IS]] (duration: 11m 12s)
* 20:54 jforrester@deploy2002: jforrester: Continuing with sync
* 20:53 jforrester@deploy2002: jforrester: Backport for [[gerrit:1256394{{!}}Abstract Wikipedia: Fix API call to get page info (T420725)]], [[gerrit:1259085{{!}}[abstractwiki] Enable the Translate extension (T420656)]], [[gerrit:1250113{{!}}Move testwiki-only Attribution REST API definition to IS]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1103.eqiad.wmnet with OS trixie
* 20:50 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy4002.wikimedia.org
* 20:50 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:50 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy4002.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
* 20:50 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy4002.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
* 20:47 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1256394{{!}}Abstract Wikipedia: Fix API call to get page info (T420725)]], [[gerrit:1259085{{!}}[abstractwiki] Enable the Translate extension (T420656)]], [[gerrit:1250113{{!}}Move testwiki-only Attribution REST API definition to IS]]
* 20:46 sukhe@cumin1003: START - Cookbook sre.dns.netbox
* 20:45 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 20:43 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1102.eqiad.wmnet [reason: trixie reimaging]
* 20:42 dani@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254448{{!}}Undeploy participant recruitment survey on ptwiki (T419275)]], [[gerrit:1254450{{!}}Undeploy participant recruitment survey on trwiki (T419275)]], [[gerrit:1254452{{!}}Undeploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1255763{{!}}testKitchen: Add custom stream name (T417050)]], [[gerrit:1259120{{!}}Enable wgCampaignEventsEnableEventGoals in]]
* 20:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1102.eqiad.wmnet with OS trixie
* 20:41 sukhe@cumin1003: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy4002.wikimedia.org
* 20:40 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy4001.wikimedia.org
* 20:40 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:40 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy4001.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
* 20:39 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy4001.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
* 20:37 dani@deploy2002: milimetric, daimona, dani: Continuing with sync
* 20:36 dani@deploy2002: milimetric, daimona, dani: Backport for [[gerrit:1254448{{!}}Undeploy participant recruitment survey on ptwiki (T419275)]], [[gerrit:1254450{{!}}Undeploy participant recruitment survey on trwiki (T419275)]], [[gerrit:1254452{{!}}Undeploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1255763{{!}}testKitchen: Add custom stream name (T417050)]], [[gerrit:1259120{{!}}Enable wgCampaignEventsEnableEventGoals i]]
* 20:35 sukhe@cumin1003: START - Cookbook sre.dns.netbox
* 20:34 dani@deploy2002: Started scap sync-world: Backport for [[gerrit:1254448{{!}}Undeploy participant recruitment survey on ptwiki (T419275)]], [[gerrit:1254450{{!}}Undeploy participant recruitment survey on trwiki (T419275)]], [[gerrit:1254452{{!}}Undeploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1255763{{!}}testKitchen: Add custom stream name (T417050)]], [[gerrit:1259120{{!}}Enable wgCampaignEventsEnableEventGoals in]]
* 20:31 sukhe@cumin1003: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy4001.wikimedia.org
* 20:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1103.eqiad.wmnet with reason: host reimage
* 20:24 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1103.eqiad.wmnet with reason: host reimage
* 20:23 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 20:19 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1102.eqiad.wmnet with reason: host reimage
* 20:17 alexsanford@deploy2002: Finished scap sync-world: Backport for [[gerrit:1256472{{!}}Reduce reauth timeout for editing site JS to 10 minutes (T419605)]] (duration: 07m 32s)
* 20:14 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1102.eqiad.wmnet with reason: host reimage
* 20:13 alexsanford@deploy2002: alexsanford: Continuing with sync
* 20:11 alexsanford@deploy2002: alexsanford: Backport for [[gerrit:1256472{{!}}Reduce reauth timeout for editing site JS to 10 minutes (T419605)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 alexsanford@deploy2002: Started scap sync-world: Backport for [[gerrit:1256472{{!}}Reduce reauth timeout for editing site JS to 10 minutes (T419605)]]
* 20:08 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1103.eqiad.wmnet with OS trixie
* 20:07 alexsanford: Deployed mitigation for [[phab:T419605|T419605]]
* 19:58 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 19:58 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 19:58 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 19:58 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1102.eqiad.wmnet with OS trixie
* 19:57 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 19:54 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 19:54 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 19:54 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 19:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha-proxy4004.wikimedia.org
* 19:51 cdobbins@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1102.eqiad.wmnet with OS trixie
* 19:50 cdobbins@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1103.eqiad.wmnet with OS trixie
* 19:50 ayounsi@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha-proxy4004.wikimedia.org
* 19:47 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 19:47 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha-proxy4003.wikimedia.org
* 19:46 ayounsi@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha-proxy4003.wikimedia.org
* 19:44 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 19:44 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>aqs[1011,1014,1016-1022]*<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 19:42 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1103.eqiad.wmnet with OS trixie
* 19:42 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1103.eqiad.wmnet [reason: trixie reimaging]
* 19:41 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1101.eqiad.wmnet [reason: trixie reimaging]
* 19:41 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1102.eqiad.wmnet with OS trixie
* 19:41 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1102.eqiad.wmnet [reason: trixie reimaging]
* 19:40 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1101.eqiad.wmnet with OS trixie
* 19:39 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1100.eqiad.wmnet [reason: trixie reimaging]
* 19:37 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1100.eqiad.wmnet with OS trixie
* 19:30 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 19:18 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1101.eqiad.wmnet with reason: host reimage
* 19:14 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1100.eqiad.wmnet with reason: host reimage
* 19:13 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1101.eqiad.wmnet with reason: host reimage
* 19:13 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:13 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:10 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1100.eqiad.wmnet with reason: host reimage
* 18:59 inflatador: bking@deploy2002 restarting opensearch-semantic-search eqiad to renew certs
* 18:57 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1101.eqiad.wmnet with OS trixie
* 18:55 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1101.eqiad.wmnet with OS trixie
* 18:54 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1100.eqiad.wmnet with OS trixie
* 18:53 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1100.eqiad.wmnet with OS trixie
* 18:50 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 18:49 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 18:36 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on hcaptcha-proxy4002.wikimedia.org with reason: depooled host (soon to be decomed)
* 18:35 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on hcaptcha-proxy4001.wikimedia.org with reason: depooled host (soon to be decomed)
* 18:10 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 18:10 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 18:05 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>aqs[1011,1014,1016-1022]*<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 17:54 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 17:54 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1115.eqiad.wmnet with OS trixie
* 17:53 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase-eqiad
* 17:49 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259136{{!}}EventStreamConfig: Document not adding performer attributes to SI interaction v2 stream (T418740)]] (duration: 06m 28s)
* 17:45 dreamyjazz@deploy2002: kharlan, dreamyjazz: Continuing with sync
* 17:45 dreamyjazz@deploy2002: kharlan, dreamyjazz: Backport for [[gerrit:1259136{{!}}EventStreamConfig: Document not adding performer attributes to SI interaction v2 stream (T418740)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:43 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1259136{{!}}EventStreamConfig: Document not adding performer attributes to SI interaction v2 stream (T418740)]]
* 17:35 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:34 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1101.eqiad.wmnet with OS trixie
* 17:34 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1101.eqiad.wmnet [reason: trixie reimaging]
* 17:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:31 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1100.eqiad.wmnet with OS trixie
* 17:30 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1100.eqiad.wmnet [reason: trixie reimaging]
* 17:27 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:26 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:24 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:23 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:22 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:21 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 17:21 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:20 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 17:20 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:18 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:17 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:17 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:16 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 17:13 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:13 bd808@deploy2002: Finished deploy [releng/jenkins-deploy@f47af21] (releasing): jobs: Use TZ=UTC in branchMWSingleVersion.groovy trigger ([[phab:T404399|T404399]]) (duration: 01m 36s)
* 17:12 bd808@deploy2002: Started deploy [releng/jenkins-deploy@f47af21] (releasing): jobs: Use TZ=UTC in branchMWSingleVersion.groovy trigger ([[phab:T404399|T404399]])
* 17:12 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:09 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:08 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:08 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:08 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:07 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:06 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:04 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:04 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:03 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:02 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 17:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1001.eqiad.wmnet
* 16:56 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:56 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 14 hosts
* 16:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:55 sukhe@cumin1003: START - Cookbook sre.hosts.remove-downtime for 14 hosts
* 16:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1001.eqiad.wmnet
* 16:53 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:52 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:52 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:46 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:41 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:38 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin2001.codfw.wmnet
* 16:35 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 16:34 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:34 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:34 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcumin2001.codfw.wmnet
* 16:32 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 16:31 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 16:30 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:29 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1023.eqiad.wmnet
* 16:29 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1023.eqiad.wmnet
* 16:28 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:27 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:24 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for aqs1010.eqiad.wmnet
* 16:24 eevans@cumin1003: START - Cookbook sre.hosts.remove-downtime for aqs1010.eqiad.wmnet
* 16:21 jgreen@dns1004: END - running authdns-update
* 16:19 jgreen@dns1004: START - running authdns-update
* 16:18 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:17 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:11 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Rebooting clouddb1023 [[phab:T419960|T419960]]
* 16:09 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1025.eqiad.wmnet
* 16:09 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1025.eqiad.wmnet
* 16:09 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1025.eqiad.wmnet
* 16:04 urandom: stopping aqs1010 for SSD replacement — [[phab:T420867|T420867]]
* 16:03 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:03 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on aqs1010.eqiad.wmnet with reason: Shutting down for SSD replacement — [[phab:T420867|T420867]]
* 15:58 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1025.eqiad.wmnet
* 15:57 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1025.eqiad.wmnet with reason: Rebooting clouddb1025 [[phab:T419960|T419960]]
* 15:57 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:53 topranks: disabling puppet for nftables-enabled machines to validate new ruleset on selected hosts before wider rollout [[phab:T420715|T420715]]
* 15:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:50 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:49 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 15:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:31 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1172.eqiad.wmnet
* 15:21 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 15:20 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 15:15 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1020.eqiad.wmnet
* 15:14 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1020.eqiad.wmnet
* 15:14 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1020.eqiad.wmnet
* 15:05 btullis@cumin1003: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker1172.eqiad.wmnet
* 15:03 btullis@cumin1003: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1172.eqiad.wmnet
* 15:03 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) opensearch-ipoid.discovery.wmnet on all recursors
* 15:03 bking@cumin2002: START - Cookbook sre.dns.wipe-cache opensearch-ipoid.discovery.wmnet on all recursors
* 15:03 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 15:01 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 15:01 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) opensearch-ipoid.discovery.wmnet on all recursors
* 14:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache opensearch-ipoid.discovery.wmnet on all recursors
* 14:58 sukhe@dns1004: END - running authdns-update
* 14:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) opensearch-test.discovery.wmnet on all recursors
* 14:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache opensearch-test.discovery.wmnet on all recursors
* 14:57 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet
* 14:57 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:56 sukhe@dns1004: START - running authdns-update
* 14:56 sukhe@dns1004: END - running authdns-update
* 14:56 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1020.eqiad.wmnet with reason: Rebooting clouddb1020 [[phab:T419960|T419960]]
* 14:56 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1019.eqiad.wmnet
* 14:56 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1019.eqiad.wmnet
* 14:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:55 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1019.eqiad.wmnet
* 14:55 sukhe@dns1004: START - running authdns-update
* 14:55 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase-eqiad
* 14:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:49 sukhe@dns1004: END - running authdns-update
* 14:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:48 sukhe@dns1004: START - running authdns-update
* 14:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:46 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1003.eqiad.wmnet
* 14:45 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 14:44 sukhe@dns1004: END - running authdns-update
* 14:43 sukhe@dns1004: START - running authdns-update
* 14:40 sukhe@dns1004: FAIL - running authdns-update
* 14:39 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1003.eqiad.wmnet
* 14:38 sukhe@dns1004: START - running authdns-update
* 14:37 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 14:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) k8s-ingress-dse-aa.discovery.wmnet on all recursors
* 14:36 bking@cumin2002: START - Cookbook sre.dns.wipe-cache k8s-ingress-dse-aa.discovery.wmnet on all recursors
* 14:34 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1019.eqiad.wmnet
* 14:34 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1019.eqiad.wmnet with reason: Rebooting clouddb1019 [[phab:T419960|T419960]]
* 14:33 sukhe@dns1004: FAIL - running authdns-update
* 14:33 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1002.eqiad.wmnet
* 14:33 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1018.eqiad.wmnet
* 14:33 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1018.eqiad.wmnet
* 14:32 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1018.eqiad.wmnet
* 14:32 sukhe@dns1004: START - running authdns-update
* 14:31 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=codfw
* 14:30 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-worker1172.eqiad.wmnet with OS bullseye
* 14:30 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:27 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1002.eqiad.wmnet
* 14:22 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1018.eqiad.wmnet with reason: Rebooting clouddb1018 [[phab:T419960|T419960]]
* 14:22 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1018.eqiad.wmnet
* 14:22 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1018.eqiad.wmnet
* 14:21 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1018.eqiad.wmnet
* 14:20 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:17 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1001.eqiad.wmnet
* 14:14 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 14:14 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2352-2356].codfw.wmnet
* 14:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2352-2356].codfw.wmnet
* 14:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on db1253.eqiad.wmnet with reason: Under repair
* 14:11 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1001.eqiad.wmnet
* 14:07 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2352-2356].codfw.wmnet
* 14:04 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha2002.wikimedia.org
* 14:04 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2352-2356].codfw.wmnet
* 14:03 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2347-2351].codfw.wmnet
* 14:03 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2347-2351].codfw.wmnet
* 14:00 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha2002.wikimedia.org
* 14:00 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha2001.wikimedia.org
* 13:59 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1172.eqiad.wmnet with reason: host reimage
* 13:57 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2347-2351].codfw.wmnet
* 13:56 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha2001.wikimedia.org
* 13:56 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:55 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha1002.wikimedia.org
* 13:55 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1172.eqiad.wmnet with reason: host reimage
* 13:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2347-2351].codfw.wmnet
* 13:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2342-2346].codfw.wmnet
* 13:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2342-2346].codfw.wmnet
* 13:51 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha1002.wikimedia.org
* 13:51 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha1001.wikimedia.org
* 13:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:47 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha1001.wikimedia.org
* 13:47 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2342-2346].codfw.wmnet
* 13:43 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2342-2346].codfw.wmnet
* 13:43 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2337-2341].codfw.wmnet
* 13:43 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2337-2341].codfw.wmnet
* 13:42 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host an-worker1172.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:42 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 13:41 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2012.codfw.wmnet
* 13:41 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2011.codfw.wmnet
* 13:38 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudlb1002.eqiad.wmnet
* 13:36 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2012.codfw.wmnet
* 13:36 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2011.codfw.wmnet
* 13:36 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2337-2341].codfw.wmnet
* 13:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2337-2341].codfw.wmnet
* 13:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2336].codfw.wmnet
* 13:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2336].codfw.wmnet
* 13:30 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2003.codfw.wmnet
* 13:30 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1002.eqiad.wmnet
* 13:29 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudlb1002.eqiad.wmnet
* 13:29 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1002.eqiad.wmnet
* 13:28 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudlb1001.eqiad.wmnet
* 13:25 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2332-2336].codfw.wmnet
* 13:24 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2003.codfw.wmnet
* 13:21 jforrester@deploy2002: mwscript-k8s job started: extensions/WikimediaMaintenance/maintenance/createExtensionTables.php --wiki=abstractwiki translate # [[phab:T420656|T420656]]
* 13:21 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2332-2336].codfw.wmnet
* 13:21 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 13:20 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1001.eqiad.wmnet
* 13:20 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudlb1001.eqiad.wmnet
* 13:20 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1001.eqiad.wmnet
* 13:19 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:19 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudlb1001.eqiad.wmnet
* 13:18 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1001.eqiad.wmnet
* 13:17 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259035{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added (T420722)]], [[gerrit:1259036{{!}}tests: add coverage for WelcomeSurveyHooks::onCentralAuthPostLoginRedirect (T420722)]], [[gerrit:1259046{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added 2 (T420722)]] (duration: 11m 43s)
* 13:16 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:14 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:13 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:11 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:11 sgimeno@deploy2002: sgimeno: Continuing with sync
* 13:08 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2005-2006,2011-2018,2033-2039,2041-2042,2044,2046,2049-2051,2055-2062,2064-2065,2067-2078,2087-2095,2102-2115,2124-2179,2184-2199].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 13:08 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2186-2199].codfw.wmnet
* 13:08 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2186-2199].codfw.wmnet
* 13:07 sgimeno@deploy2002: sgimeno: Backport for [[gerrit:1259035{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added (T420722)]], [[gerrit:1259036{{!}}tests: add coverage for WelcomeSurveyHooks::onCentralAuthPostLoginRedirect (T420722)]], [[gerrit:1259046{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added 2 (T420722)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Ch
* 13:05 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1259035{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added (T420722)]], [[gerrit:1259036{{!}}tests: add coverage for WelcomeSurveyHooks::onCentralAuthPostLoginRedirect (T420722)]], [[gerrit:1259046{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added 2 (T420722)]]
* 12:43 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "bast4006 - ayounsi@cumin1003"
* 12:42 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "bast4006 - ayounsi@cumin1003"
* 12:42 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host testvm2006.codfw.wmnet with OS bookworm
* 12:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host bast4006.wikimedia.org
* 12:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast4006.wikimedia.org with OS bookworm
* 12:34 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:28 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
* 12:22 cmooney@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
* 12:18 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast4006.wikimedia.org with reason: host reimage
* 12:16 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2186-2199].codfw.wmnet
* 12:14 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on bast4006.wikimedia.org with reason: host reimage
* 12:08 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host testvm2006.codfw.wmnet with OS bookworm
* 12:07 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2186-2199].codfw.wmnet
* 12:04 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2168-2179,2184-2185].codfw.wmnet
* 12:04 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2168-2179,2184-2185].codfw.wmnet
* 11:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2168-2179,2184-2185].codfw.wmnet
* 11:44 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2168-2179,2184-2185].codfw.wmnet
* 11:40 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2154-2167].codfw.wmnet
* 11:40 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2154-2167].codfw.wmnet
* 11:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2154-2167].codfw.wmnet
* 11:23 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2154-2167].codfw.wmnet
* 11:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2140-2153].codfw.wmnet
* 11:20 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS bookworm
* 11:20 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2140-2153].codfw.wmnet
* 11:19 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM bast4006.wikimedia.org - ayounsi@cumin1003"
* 11:19 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM bast4006.wikimedia.org - ayounsi@cumin1003"
* 11:19 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) bast4006.wikimedia.org on all recursors
* 11:19 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache bast4006.wikimedia.org on all recursors
* 11:19 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:19 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM bast4006.wikimedia.org - ayounsi@cumin1003"
* 11:19 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM bast4006.wikimedia.org - ayounsi@cumin1003"
* 11:15 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 11:15 ayounsi@cumin1003: START - Cookbook sre.ganeti.makevm for new host bast4006.wikimedia.org
* 11:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts install4003.wikimedia.org
* 11:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install4003.wikimedia.org decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003"
* 11:08 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install4003.wikimedia.org decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003"
* 11:08 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2140-2153].codfw.wmnet
* 11:05 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 11:00 ayounsi@cumin1003: START - Cookbook sre.hosts.decommission for hosts install4003.wikimedia.org
* 10:57 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:55 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2140-2153].codfw.wmnet
* 10:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2126-2139].codfw.wmnet
* 10:55 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2126-2139].codfw.wmnet
* 10:44 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2126-2139].codfw.wmnet
* 10:43 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:38 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 10:38 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:38 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:38 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 10:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2126-2139].codfw.wmnet
* 10:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2104-2115,2124-2125].codfw.wmnet
* 10:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2104-2115,2124-2125].codfw.wmnet
* 10:28 topranks: disable puppet on routed-ganeti hosts to test nftables update on specific nodes [[phab:T420715|T420715]]
* 10:27 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s1
* 10:25 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s1
* 10:25 ayounsi@dns1004: END - running authdns-update
* 10:24 ayounsi@dns1004: START - running authdns-update
* 10:23 btullis@cumin1003: START - Cookbook sre.hosts.provision for host an-worker1172.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 10:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2104-2115,2124-2125].codfw.wmnet
* 10:20 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s4
* 10:18 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s4
* 10:13 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s8
* 10:11 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s8
* 10:09 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1172.eqiad.wmnet with OS bullseye
* 10:08 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2104-2115,2124-2125].codfw.wmnet
* 10:05 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s7
* 10:05 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2076-2078,2087-2095,2102-2103].codfw.wmnet
* 10:04 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2076-2078,2087-2095,2102-2103].codfw.wmnet
* 10:04 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s7
* 09:58 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s3
* 09:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:57 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s3
* 09:53 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:53 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2076-2078,2087-2095,2102-2103].codfw.wmnet
* 09:52 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s2
* 09:49 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s2
* 09:49 blake@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 09:49 blake@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 09:49 blake@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 09:49 blake@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 09:44 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2076-2078,2087-2095,2102-2103].codfw.wmnet
* 09:44 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s5
* 09:44 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2060-2062,2064-2065,2067-2075].codfw.wmnet
* 09:44 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2060-2062,2064-2065,2067-2075].codfw.wmnet
* 09:42 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s5
* 09:40 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:39 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:33 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s6
* 09:32 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s6
* 09:29 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:29 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:25 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 09:25 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 09:24 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section es7
* 09:23 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section es7
* 09:22 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 09:22 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section es6
* 09:16 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section es6
* 09:11 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:11 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:10 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section x3
* 09:09 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section x3
* 09:08 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2060-2062,2064-2065,2067-2075].codfw.wmnet
* 09:05 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:05 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:02 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section x1
* 09:01 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section x1
* 09:00 federico3: starting [[phab:T416706|T416706]]
* 09:00 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2060-2062,2064-2065,2067-2075].codfw.wmnet
* 08:59 fceratto@cumin1003: END (FAIL) - Cookbook sre.switchdc.databases.prepare (exit_code=99) for the switch from eqiad to codfw for section test-s4
* 08:59 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from eqiad to codfw for section test-s4
* 08:59 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2038-2039,2041-2042,2044,2046,2049-2051,2055-2059].codfw.wmnet
* 08:59 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2038-2039,2041-2042,2044,2046,2049-2051,2055-2059].codfw.wmnet
* 08:50 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2038-2039,2041-2042,2044,2046,2049-2051,2055-2059].codfw.wmnet
* 08:46 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255736{{!}}hcaptcha: Use the global edit key for MobileFrontend edits if present (T420574)]] (duration: 14m 42s)
* 08:44 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 08:43 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 08:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2038-2039,2041-2042,2044,2046,2049-2051,2055-2059].codfw.wmnet
* 08:40 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2005-2006,2011-2018,2033-2037].codfw.wmnet
* 08:40 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2005-2006,2011-2018,2033-2037].codfw.wmnet
* 08:40 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:39 kharlan@deploy2002: kharlan: Continuing with sync
* 08:38 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:37 kharlan@deploy2002: kharlan: Backport for [[gerrit:1255736{{!}}hcaptcha: Use the global edit key for MobileFrontend edits if present (T420574)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 08:31 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1255736{{!}}hcaptcha: Use the global edit key for MobileFrontend edits if present (T420574)]]
* 08:29 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2005-2006,2011-2018,2033-2037].codfw.wmnet
* 08:19 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2005-2006,2011-2018,2033-2037].codfw.wmnet
* 08:18 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2005-2006,2011-2018,2033-2039,2041-2042,2044,2046,2049-2051,2055-2062,2064-2065,2067-2078,2087-2095,2102-2115,2124-2179,2184-2199].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 07:45 kartik@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254149{{!}}Enable ULS rewrite beta feature (T418187 T253303)]] (duration: 41m 30s)
* 07:42 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 07:33 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 07:30 kartik@deploy2002: kartik, abi: Continuing with sync
* 07:30 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 07:29 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 07:22 kartik@deploy2002: kartik, abi: Backport for [[gerrit:1254149{{!}}Enable ULS rewrite beta feature (T418187 T253303)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 07:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 07:17 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 07:16 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 07:03 kartik@deploy2002: Started scap sync-world: Backport for [[gerrit:1254149{{!}}Enable ULS rewrite beta feature (T418187 T253303)]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 55s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-22 ==
* 02:50 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doh7004.wikimedia.org with reason: depooled host
* 02:50 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doh7003.wikimedia.org with reason: depooled host
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 21s)
* 02:01 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-20 ==
* 23:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2013.codfw.wmnet
* 23:30 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs2013.codfw.wmnet
* 22:34 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host lvs2013.codfw.wmnet
* 22:34 brett: Started pybal on lvs2013
* 22:27 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 21:57 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5023.eqsin.wmnet [reason: trixie reimaging]
* 21:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5023.eqsin.wmnet with OS trixie
* 21:55 hashar: Upgrading CI Jenkins [[phab:T420477|T420477]]
* 21:25 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5023.eqsin.wmnet with reason: host reimage
* 21:21 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5023.eqsin.wmnet with reason: host reimage
* 21:04 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2013.codfw.wmnet with reason: debugging ipip
* 20:46 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5023.eqsin.wmnet with OS trixie
* 20:45 mutante: contint1003/2003 apt remove --purge apache2* ; apt remove --purge php* {{!}} [[phab:T418521|T418521]]
* 20:43 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
* 20:40 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 20:38 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5023.eqsin.wmnet with OS trixie
* 20:24 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doh3006.wikimedia.org with reason: depooled host
* 20:24 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doh3005.wikimedia.org with reason: depooled host
* 20:23 sukhe@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on doh3005.wikimedia.org with reason: depooled host
* 19:50 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2013.codfw.wmnet with reason: debugging ipip
* 19:33 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
* 19:30 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 19:21 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy (exit_code=0) rolling reboot on A:tcpproxy and A:tcpproxy
* 19:16 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5023.eqsin.wmnet with OS trixie
* 19:16 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5023.eqsin.wmnet [reason: trixie reimaging]
* 19:16 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5021.eqsin.wmnet [reason: trixie reimaging]
* 19:14 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5021.eqsin.wmnet with OS trixie
* 18:52 cdobbins@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2013.codfw.wmnet with reason: reboot
* 18:43 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5021.eqsin.wmnet with reason: host reimage
* 18:39 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5021.eqsin.wmnet with reason: host reimage
* 18:28 cdobbins@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2013.codfw.wmnet with reason: reboot
* 18:16 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy rolling reboot on A:tcpproxy and A:tcpproxy
* 18:14 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on db1253.eqiad.wmnet with reason: [[phab:T420041|T420041]]
* 17:59 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5021.eqsin.wmnet with OS trixie
* 17:54 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5021.eqsin.wmnet with OS trixie
* 17:51 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host lvs2014.codfw.wmnet
* 17:40 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on contint1003.wikimedia.org with reason: jenkins on java21
* 17:39 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2014.codfw.wmnet
* 16:54 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 16:54 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1172.eqiad.wmnet with OS bullseye
* 16:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 16:33 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5021.eqsin.wmnet with OS trixie
* 16:32 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5021.eqsin.wmnet [reason: trixie reimaging]
* 16:09 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1172.eqiad.wmnet with OS bullseye
* 16:08 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2002.codfw.wmnet
* 16:02 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf2002.codfw.wmnet
* 15:51 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2001.codfw.wmnet
* 15:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 15:45 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf2001.codfw.wmnet
* 15:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2041.codfw.wmnet
* 15:37 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc2041.codfw.wmnet
* 15:32 cparle@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:32 cparle@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-eqiad
* 15:16 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2040.codfw.wmnet
* 15:10 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc2040.codfw.wmnet
* 15:02 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 15:01 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 15:00 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:59 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:58 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:58 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:57 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:56 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:55 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:50 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2002].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 14:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2002.codfw.wmnet
* 14:45 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2002.codfw.wmnet
* 14:44 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:44 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:37 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2002.codfw.wmnet
* 14:37 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2002.codfw.wmnet
* 14:37 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2001.codfw.wmnet
* 14:37 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2001.codfw.wmnet
* 14:36 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:35 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:35 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:34 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:30 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2001.codfw.wmnet
* 14:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2001.codfw.wmnet
* 14:29 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2002].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 14:27 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1335-1349].eqiad.wmnet
* 14:27 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1335-1349].eqiad.wmnet
* 14:26 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2039.codfw.wmnet
* 14:21 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc2039.codfw.wmnet
* 14:16 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:16 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2038.codfw.wmnet
* 14:08 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc2038.codfw.wmnet
* 13:54 jgreen@dns1004: END - running authdns-update
* 13:52 jgreen@dns1004: START - running authdns-update
* 13:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:39 inflatador: bking@deploy2002 restarting opensearch-ipoid cluster to apply new certificates
* 13:33 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-eqiad
* 13:20 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:14 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-canary
* 13:14 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for doh[3005-3006].wikimedia.org
* 13:14 sukhe@cumin1003: START - Cookbook sre.hosts.remove-downtime for doh[3005-3006].wikimedia.org
* 13:08 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-canary
* 13:05 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:03 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:58 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2006.codfw.wmnet
* 12:56 cparle@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 12:55 cparle@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 12:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2006.codfw.wmnet
* 12:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2001.codfw.wmnet
* 12:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2001.codfw.wmnet
* 12:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1005.eqiad.wmnet
* 12:35 jiji@cumin1003: END (ERROR) - Cookbook sre.memcached.roll-reboot-restart (exit_code=97) rolling reboot on A:memcached-codfw
* 12:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1005.eqiad.wmnet
* 12:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw2001.wikimedia.org
* 12:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw2001.wikimedia.org
* 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw1001.wikimedia.org
* 12:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw1001.wikimedia.org
* 11:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 11:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1172.eqiad.wmnet with OS bullseye
* 11:27 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 11:24 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 10:26 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 10:13 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:12 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:10 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:05 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:05 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:04 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:02 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:58 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:57 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:57 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:56 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:56 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:55 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:53 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:50 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:46 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:46 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:45 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:37 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:36 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:36 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:35 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:35 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:34 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:34 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:33 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:26 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:25 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:24 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:23 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:19 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:19 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:18 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:18 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:18 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:15 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:57 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 05:30 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5024.eqsin.wmnet [reason: trixie reimaging]
* 05:30 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5019.eqsin.wmnet [reason: trixie reimaging]
* 02:43 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on doh3005.wikimedia.org with reason: alerting is flapping
* 02:42 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on doh3006.wikimedia.org with reason: alerting is flapping
* 01:21 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5019.eqsin.wmnet with OS trixie
* 01:17 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5024.eqsin.wmnet with OS trixie
* 00:48 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5019.eqsin.wmnet with reason: host reimage
* 00:44 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
* 00:38 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5019.eqsin.wmnet with reason: host reimage
* 00:37 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
* 00:05 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5024.eqsin.wmnet with OS trixie
* 00:01 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5024.eqsin.wmnet with OS trixie
* 00:01 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5024.eqsin.wmnet with OS trixie
== 2026-03-19 ==
* 23:59 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5019.eqsin.wmnet with OS trixie
* 23:40 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255801{{!}}Make the handler follow the thumb steps (T414805)]] (duration: 06m 14s)
* 23:36 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 23:35 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1255801{{!}}Make the handler follow the thumb steps (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:33 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1255801{{!}}Make the handler follow the thumb steps (T414805)]]
* 22:48 zabe@deploy2002: mwscript-k8s job started: foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https # [[phab:T420643|T420643]]
* 22:19 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 22:18 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 22:17 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 22:16 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 22:08 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255886{{!}}Set WikiLambdaAbstractNamespaces's merge_strategy to provide_default (T420649)]] (duration: 06m 46s)
* 22:04 jforrester@deploy2002: jforrester: Continuing with sync
* 22:03 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255886{{!}}Set WikiLambdaAbstractNamespaces's merge_strategy to provide_default (T420649)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:01 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255886{{!}}Set WikiLambdaAbstractNamespaces's merge_strategy to provide_default (T420649)]]
* 21:58 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5019.eqsin.wmnet with OS trixie
* 21:57 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase-codfw
* 21:57 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5019.eqsin.wmnet [reason: trixie reimaging]
* 21:56 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5024.eqsin.wmnet with OS trixie
* 21:56 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5024.eqsin.wmnet [reason: trixie reimaging]
* 21:55 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255765{{!}}Implement addListener fallback for older browsers in matchMedia (T419717)]] (duration: 07m 17s)
* 21:51 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 21:49 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1255765{{!}}Implement addListener fallback for older browsers in matchMedia (T419717)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:48 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1255765{{!}}Implement addListener fallback for older browsers in matchMedia (T419717)]]
* 21:29 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255881{{!}}Skins: Address issue with blurry images for large thumbnails (T375981)]] (duration: 07m 03s)
* 21:25 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 21:24 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1255881{{!}}Skins: Address issue with blurry images for large thumbnails (T375981)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:22 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1255881{{!}}Skins: Address issue with blurry images for large thumbnails (T375981)]]
* 21:11 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup2020.codfw.wmnet with reason: kernel module reload
* 21:10 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 11 hosts with reason: kernel module reload
* 20:36 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254865{{!}}Deploy Extension:PersonalDashboard to English Wikipedia (T418367)]], [[gerrit:1253654{{!}}Deploy PRV to 13 wikis (T420273)]] (duration: 11m 00s)
* 20:32 kgraessle@deploy2002: kgraessle, arlolra: Continuing with sync
* 20:27 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1016.eqiad.wmnet
* 20:27 kgraessle@deploy2002: kgraessle, arlolra: Backport for [[gerrit:1254865{{!}}Deploy Extension:PersonalDashboard to English Wikipedia (T418367)]], [[gerrit:1253654{{!}}Deploy PRV to 13 wikis (T420273)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1254865{{!}}Deploy Extension:PersonalDashboard to English Wikipedia (T418367)]], [[gerrit:1253654{{!}}Deploy PRV to 13 wikis (T420273)]]
* 20:25 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1016.eqiad.wmnet
* 20:11 cdobbins@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs1016.eqiad.wmnet with reason: reboot
* 20:01 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:01 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add analytic vlan hostnames - cmooney@cumin1003"
* 20:01 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add analytic vlan hostnames - cmooney@cumin1003"
* 19:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1018.eqiad.wmnet
* 19:56 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 19:56 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1018.eqiad.wmnet
* 19:55 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 19:53 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) 4.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.8.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa. on all recursors
* 19:53 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache 4.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.8.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa. on all recursors
* 19:52 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 19:51 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 7 hosts with reason: kernel module reload
* 19:44 topranks: disable IPv6 VRRP for et-1/0/5.1023 sub-interfaces on eqiad core routers [[phab:T405562|T405562]]
* 19:36 brett: stopping pybal/puppet on lvs1018 for reboots
* 19:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: reboots
* 19:00 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 6 hosts with reason: kernel module reload
* 19:00 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1019.eqiad.wmnet
* 19:00 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase-codfw
* 19:00 topranks: add vlan sub-interface for analytics1-d-eqiad vlan to leaf switches in eqiad row d [[phab:T405562|T405562]]
* 18:44 cdobbins@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs1019.eqiad.wmnet with reason: planned reboot
* 18:42 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:aqs-codfw
* 18:31 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255824{{!}}RepoBooks::onMediaWikiServices: Skip all low NSes, not just NS0 (T420617)]], [[gerrit:1255820{{!}}SpecialAbstractContent: Fix hard-coded policy list page namespace]], [[gerrit:1255794{{!}}[abstractwiki] Allow "Abstract:" as well as "Abstract Wikipedia:" as NS_PROJECT]] (duration: 06m 20s)
* 18:27 jforrester@deploy2002: jforrester: Continuing with sync
* 18:26 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255824{{!}}RepoBooks::onMediaWikiServices: Skip all low NSes, not just NS0 (T420617)]], [[gerrit:1255820{{!}}SpecialAbstractContent: Fix hard-coded policy list page namespace]], [[gerrit:1255794{{!}}[abstractwiki] Allow "Abstract:" as well as "Abstract Wikipedia:" as NS_PROJECT]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:24 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255824{{!}}RepoBooks::onMediaWikiServices: Skip all low NSes, not just NS0 (T420617)]], [[gerrit:1255820{{!}}SpecialAbstractContent: Fix hard-coded policy list page namespace]], [[gerrit:1255794{{!}}[abstractwiki] Allow "Abstract:" as well as "Abstract Wikipedia:" as NS_PROJECT]]
* 18:02 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1020.eqiad.wmnet
* 17:55 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
* 17:46 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 17:46 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 17:45 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host lvs1020.eqiad.wmnet
* 17:44 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
* 17:30 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 17:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host hcaptcha-proxy4004.wikimedia.org
* 17:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy4004.wikimedia.org with OS bookworm
* 17:26 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on contint1003.wikimedia.org with reason: jenkins on java21
* 17:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp5026.eqsin.wmnet
* 17:22 bd808@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:22 bd808@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:22 bd808@deploy2002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:22 bd808@deploy2002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:21 bd808@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 17:15 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp5026.eqsin.wmnet
* 17:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy4004.wikimedia.org with reason: host reimage
* 17:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts doh4002.wikimedia.org
* 17:08 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:08 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: doh4002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 17:08 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy4004.wikimedia.org with reason: host reimage
* 17:08 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: doh4002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 17:07 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 17:07 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 17:05 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp5026.eqsin.wmnet with reason: firmware updates
* 17:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 17:03 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5025.*
* 17:01 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp5025.eqsin.wmnet
* 16:59 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts doh4002.wikimedia.org
* 16:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts doh4001.wikimedia.org
* 16:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: doh4001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 16:59 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: doh4001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 16:58 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 16:57 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 16:57 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 16:56 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 16:54 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:52 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp5025.eqsin.wmnet
* 16:50 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts doh4001.wikimedia.org
* 16:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-master1004.eqiad.wmnet
* 16:46 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1151.eqiad.wmnet
* 16:45 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5026.eqsin.wmnet with OS trixie
* 16:44 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy4004.wikimedia.org with OS bookworm
* 16:44 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet,service=s3
* 16:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4004.wikimedia.org - jmm@cumin2002"
* 16:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4004.wikimedia.org - jmm@cumin2002"
* 16:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy4004.wikimedia.org on all recursors
* 16:43 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy4004.wikimedia.org on all recursors
* 16:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4004.wikimedia.org - jmm@cumin2002"
* 16:42 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4004.wikimedia.org - jmm@cumin2002"
* 16:42 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255039{{!}}Activate Abstract Wikipedia (T411723)]] (duration: 06m 09s)
* 16:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp5025.eqsin.wmnet with reason: firmware updates
* 16:41 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-master1004.eqiad.wmnet
* 16:40 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5025.eqsin.wmnet with OS trixie
* 16:39 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1151.eqiad.wmnet
* 16:39 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:39 jmm@cumin2002: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
* 16:38 jforrester@deploy2002: jforrester: Continuing with sync
* 16:38 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255039{{!}}Activate Abstract Wikipedia (T411723)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:36 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255039{{!}}Activate Abstract Wikipedia (T411723)]]
* 16:35 jmm@cumin2002: END (PASS) - Cookbook sre.netbox.restart-reboot (exit_code=0) rolling reboot on A:netbox
* 16:33 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255779{{!}}Revert "[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki"]] (duration: 07m 19s)
* 16:29 jforrester@deploy2002: jforrester: Continuing with sync
* 16:28 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255779{{!}}Revert "[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:26 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255779{{!}}Revert "[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki"]]
* 16:25 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:24 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255773{{!}}[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki (T420531)]] (duration: 06m 06s)
* 16:23 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:aqs-codfw
* 16:21 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 16:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 16:20 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:20 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy4004.wikimedia.org
* 16:20 fabfur@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=1) rolling upgrade of HAProxy on P<nowiki>{</nowiki>cp2041*<nowiki>}</nowiki> and A:cp - 3.2 test upgrade ()
* 16:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P<nowiki>{</nowiki>cp2041*<nowiki>}</nowiki> and A:cp - 3.2 test upgrade ()
* 16:20 jforrester@deploy2002: jforrester: Continuing with sync
* 16:19 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255773{{!}}[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki (T420531)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2003.codfw.wmnet
* 16:17 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255773{{!}}[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki (T420531)]]
* 16:17 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 16:17 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 16:17 jmm@cumin2002: START - Cookbook sre.netbox.restart-reboot rolling reboot on A:netbox
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host hcaptcha-proxy4003.wikimedia.org
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy4003.wikimedia.org with OS bookworm
* 16:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1142.eqiad.wmnet
* 16:14 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk2003.codfw.wmnet
* 16:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2002.codfw.wmnet
* 16:13 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
* 16:10 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk2002.codfw.wmnet
* 16:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5025.eqsin.wmnet with reason: host reimage
* 16:08 brouberol@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 16:07 brouberol@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 16:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1142.eqiad.wmnet
* 16:06 brouberol@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 16:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2001.codfw.wmnet
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5025.eqsin.wmnet with reason: host reimage
* 16:05 brouberol@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 16:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk2001.codfw.wmnet
* 15:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy4003.wikimedia.org with reason: host reimage
* 15:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy4003.wikimedia.org with reason: host reimage
* 15:36 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief2002.codfw.wmnet
* 15:35 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5025.eqsin.wmnet with OS trixie
* 15:34 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5026.eqsin.wmnet with OS trixie
* 15:34 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5025.eqsin.wmnet with OS trixie
* 15:34 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5026.eqsin.wmnet with OS trixie
* 15:32 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief2002.codfw.wmnet
* 15:32 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief1002.eqiad.wmnet
* 15:32 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy4003.wikimedia.org with OS bookworm
* 15:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4003.wikimedia.org - jmm@cumin2002"
* 15:31 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4003.wikimedia.org - jmm@cumin2002"
* 15:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy4003.wikimedia.org on all recursors
* 15:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy4003.wikimedia.org on all recursors
* 15:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4003.wikimedia.org - jmm@cumin2002"
* 15:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test2001.codfw.wmnet
* 15:29 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test1001.eqiad.wmnet
* 15:28 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief1002.eqiad.wmnet
* 15:26 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test2001.codfw.wmnet
* 15:25 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test1001.eqiad.wmnet
* 15:22 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255747{{!}}Hooks: Re-apply I52fc151ab88d79754baeff35d2c0f200ebe9fc9a]] (duration: 09m 55s)
* 15:22 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4003.wikimedia.org - jmm@cumin2002"
* 15:22 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:22 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:21 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:21 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:21 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:18 phuedx@deploy2002: phuedx: Continuing with sync
* 15:18 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:17 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:17 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 15:16 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 15:16 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 15:15 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 15:15 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 15:15 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 15:15 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 15:14 phuedx@deploy2002: phuedx: Backport for [[gerrit:1255747{{!}}Hooks: Re-apply I52fc151ab88d79754baeff35d2c0f200ebe9fc9a]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 15:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy4003.wikimedia.org
* 15:12 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1255747{{!}}Hooks: Re-apply I52fc151ab88d79754baeff35d2c0f200ebe9fc9a]]
* 15:11 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 15:10 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 15:10 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 15:09 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 15:09 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 15:09 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 15:07 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5026.eqsin.wmnet with OS trixie
* 15:06 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5025.eqsin.wmnet with OS trixie
* 15:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh4004.wikimedia.org
* 15:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host doh4004.wikimedia.org with OS bookworm
* 15:05 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk1003.eqiad.wmnet
* 15:01 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk1003.eqiad.wmnet
* 14:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk1002.eqiad.wmnet
* 14:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host matomo1003.eqiad.wmnet
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zookeeper-test1002.eqiad.wmnet
* 14:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk1002.eqiad.wmnet
* 14:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk1001.eqiad.wmnet
* 14:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host matomo1003.eqiad.wmnet
* 14:52 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host zookeeper-test1002.eqiad.wmnet
* 14:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk1001.eqiad.wmnet
* 14:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-conf1006.eqiad.wmnet
* 14:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh4004.wikimedia.org with reason: host reimage
* 14:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-conf1006.eqiad.wmnet
* 14:43 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh4004.wikimedia.org with reason: host reimage
* 14:40 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 14:38 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 14:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-conf1005.eqiad.wmnet
* 14:32 bking@cumin2002: conftool action : set/pooled=yes:weight=1; selector: name=dse-k8s-worker1010.eqiad.wmnet{{!}}dse-k8s-worker1011.eqiad.wmnet{{!}}dse-k8s-worker1012.eqiad.wmnet{{!}}dse-k8s-worker1013.eqiad.wmnet{{!}}dse-k8s-worker1015.eqiad.wmnet{{!}}dse-k8s-worker1016.eqiad.wmnet{{!}}dse-k8s-worker1017.eqiad.wmnet{{!}}dse-k8s-worker1018.eqiad.wmnet{{!}}dse-k8s-worker1019.eqiad.wmnet
* 14:29 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-conf1005.eqiad.wmnet
* 14:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-conf1004.eqiad.wmnet
* 14:25 bking@cumin2002: conftool action : set/pooled=yes:weight=10; selector: name=dse-k8s-worker1012.eqiad.wmnet{{!}}dse-k8s-worker1015.eqiad.wmnet{{!}}dse-k8s-worker1016.eqiad.wmnet{{!}}dse-k8s-worker1017.eqiad.wmnet
* 14:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-conf1004.eqiad.wmnet
* 14:21 jmm@deploy2002: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 14:20 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host doh4004.wikimedia.org with OS bookworm
* 14:20 jmm@deploy2002: helmfile [eqiad] START helmfile.d/services/proton: apply
* 14:19 jmm@deploy2002: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 14:18 jmm@deploy2002: helmfile [codfw] START helmfile.d/services/proton: apply
* 14:18 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh4004.wikimedia.org - jmm@cumin2002"
* 14:17 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh4004.wikimedia.org - jmm@cumin2002"
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4004.wikimedia.org on all recursors
* 14:17 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4004.wikimedia.org on all recursors
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4004.wikimedia.org - jmm@cumin2002"
* 14:13 jmm@deploy2002: helmfile [staging] DONE helmfile.d/services/proton: apply
* 14:12 jmm@deploy2002: helmfile [staging] START helmfile.d/services/proton: apply
* 14:11 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4004.wikimedia.org - jmm@cumin2002"
* 14:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:04 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh4004.wikimedia.org
* 14:03 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh4003.wikimedia.org
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host doh4003.wikimedia.org with OS bookworm
* 13:53 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:46 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250107{{!}}Expose new wikifunctions.v0 REST API module on Wikifunctions.org only (T419053)]] (duration: 06m 03s)
* 13:42 jforrester@deploy2002: jforrester: Continuing with sync
* 13:42 jforrester@deploy2002: jforrester: Backport for [[gerrit:1250107{{!}}Expose new wikifunctions.v0 REST API module on Wikifunctions.org only (T419053)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:40 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1250107{{!}}Expose new wikifunctions.v0 REST API module on Wikifunctions.org only (T419053)]]
* 13:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh4003.wikimedia.org with reason: host reimage
* 13:33 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh4003.wikimedia.org with reason: host reimage
* 13:22 urbanecm@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255686{{!}}CreateAccount: Add class to aide in instrumentation]], [[gerrit:1255685{{!}}createAccount: Log exposure and CTRs for account creation experiment (T419916)]] (duration: 12m 58s)
* 13:22 moritzm: upgrade rpki1001 to Routinator 0.15.1 [[phab:T420572|T420572]]
* 13:15 urbanecm@deploy2002: migr, urbanecm: Continuing with sync
* 13:13 urbanecm@deploy2002: migr, urbanecm: Backport for [[gerrit:1255686{{!}}CreateAccount: Add class to aide in instrumentation]], [[gerrit:1255685{{!}}createAccount: Log exposure and CTRs for account creation experiment (T419916)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:12 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host doh4003.wikimedia.org with OS bookworm
* 13:09 urbanecm@deploy2002: Started scap sync-world: Backport for [[gerrit:1255686{{!}}CreateAccount: Add class to aide in instrumentation]], [[gerrit:1255685{{!}}createAccount: Log exposure and CTRs for account creation experiment (T419916)]]
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh4003.wikimedia.org - jmm@cumin2002"
* 13:09 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh4003.wikimedia.org - jmm@cumin2002"
* 13:07 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 17 hosts with reason: upgrade
* 13:01 moritzm: installing rsync security updates
* 12:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1017.eqiad.wmnet
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm7001.magru.wmnet
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm7001.magru.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 12:59 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm7001.magru.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 12:57 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 12:54 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 12:53 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1017.eqiad.wmnet
* 12:53 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1016.eqiad.wmnet
* 12:52 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 12:52 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 12:51 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 12:51 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 12:50 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 12:50 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 12:50 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 12:50 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 12:49 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 12:48 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 12:48 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 12:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1016.eqiad.wmnet
* 12:47 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:46 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:46 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 12:46 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:46 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:45 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:45 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:45 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:45 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 12:43 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 12:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 12:43 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb2002.codfw.wmnet
* 12:41 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 12:41 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm7001.magru.wmnet
* 12:41 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 12:41 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 12:40 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 12:40 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 12:39 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 12:39 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 12:38 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 12:37 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 12:37 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:37 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
* 12:37 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh4003.wikimedia.org
* 12:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb2002.codfw.wmnet
* 12:29 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:27 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:25 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:25 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:24 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:23 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:22 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:22 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:10 urbanecm@deploy2002: mwscript-k8s job started: GrowthExperiments:reassignMentees --wiki=enwiki --mentor=Bilorv --performer=Bilorv --as-job # [[phab:T418194|T418194]]
* 11:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 11:59 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:58 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-codfw
* 11:57 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 11:53 moritzm: upgrade rpki2003 to Routinator 0.15.1 [[phab:T420572|T420572]]
* 11:46 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:40 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:40 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1017.eqiad.wmnet with reason: host reimage
* 11:26 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1017.eqiad.wmnet with reason: host reimage
* 11:18 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 11:18 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-codfw
* 11:11 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 11:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
* 11:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
* 11:05 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-codfw
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 10:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 10:55 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5]*<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 10:54 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host doh4003.wikimedia.org
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 10:54 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 10:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2004.codfw.wmnet
* 10:51 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 10:50 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:50 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 10:50 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 10:50 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:50 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 10:50 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 10:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2004.codfw.wmnet
* 10:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2005.codfw.wmnet
* 10:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema2004.codfw.wmnet
* 10:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2005.codfw.wmnet
* 10:43 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:43 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh4003.wikimedia.org
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4007.ulsfo.wmnet to cluster ulsfo02 and group 01
* 10:42 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4007.ulsfo.wmnet to cluster ulsfo02 and group 01
* 10:41 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host schema2004.codfw.wmnet
* 10:39 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema2003.codfw.wmnet
* 10:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host schema2003.codfw.wmnet
* 10:37 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1002.wikimedia.org
* 10:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema1004.eqiad.wmnet
* 10:36 Raine: created temporary categorylinks_icu72 tables -- [[phab:T419980|T419980]], [[phab:T419049|T419049]]
* 10:36 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 10:34 aokoth@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 10:33 aokoth@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 10:32 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host schema1004.eqiad.wmnet
* 10:32 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:31 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 10:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4007.ulsfo.wmnet
* 10:29 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5]*<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 10:28 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:28 fnegri@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1002.wikimedia.org
* 10:26 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2006.codfw.wmnet
* 10:25 btullis@cumin1003: END (PASS) - Cookbook sre.opensearch.roll-restart-reboot (exit_code=0) rolling reboot on A:datahubsearch
* 10:24 btullis@cumin1003: END (FAIL) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=99) rolling reboot on A:cephosd-eqiad
* 10:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4007.ulsfo.wmnet
* 10:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2006.codfw.wmnet
* 10:21 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1001.wikimedia.org
* 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2008.wikimedia.org
* 10:19 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:18 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2008.wikimedia.org
* 10:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2007.codfw.wmnet
* 10:13 fnegri@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1001.wikimedia.org
* 10:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2007.codfw.wmnet
* 10:09 btullis@cumin1003: START - Cookbook sre.opensearch.roll-restart-reboot rolling reboot on A:datahubsearch
* 10:04 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:03 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4007.ulsfo.wmnet with OS bookworm
* 09:58 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 17 hosts with reason: upgrade
* 09:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema1003.eqiad.wmnet
* 09:53 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host schema1003.eqiad.wmnet
* 09:46 jnuche@deploy2002: Finished deploy [releng/jenkins-deploy@863e5c2] (releasing): [[phab:T420477|T420477]] (duration: 01m 07s)
* 09:45 jnuche@deploy2002: Started deploy [releng/jenkins-deploy@863e5c2] (releasing): [[phab:T420477|T420477]]
* 09:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 09:43 jnuche@deploy2002: Finished deploy [releng/jenkins-deploy@863e5c2] (releasing): [[phab:T420477|T420477]] (duration: 00m 59s)
* 09:42 jnuche@deploy2002: Started deploy [releng/jenkins-deploy@863e5c2] (releasing): [[phab:T420477|T420477]]
* 09:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4007.ulsfo.wmnet with reason: host reimage
* 09:35 moritzm: installing libnginx-mod-http-lua security updates
* 09:35 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4007.ulsfo.wmnet with reason: host reimage
* 09:29 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
* 09:26 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
* 09:26 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
* 09:24 klausman@cumin2002: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:ml-serve-worker-codfw
* 09:21 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:21 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:19 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:19 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:13 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4007.ulsfo.wmnet with OS bookworm
* 09:11 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
* 09:01 moritzm: remove ganeti4007 from classic Ganeti cluster in ulsfo [[phab:T418993|T418993]]
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of doh4001.wikimedia.org to plain
* 08:54 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of doh4001.wikimedia.org to plain
* 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of doh4002.wikimedia.org to plain
* 08:46 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of doh4002.wikimedia.org to plain
* 08:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of hcaptcha-proxy4001.wikimedia.org to plain
* 08:45 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of hcaptcha-proxy4001.wikimedia.org to plain
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of hcaptcha-proxy4002.wikimedia.org to plain
* 08:44 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of hcaptcha-proxy4002.wikimedia.org to plain
* 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of install4003.wikimedia.org to plain
* 08:42 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of install4003.wikimedia.org to plain
* 08:40 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:38 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host doh4003.wikimedia.org
* 08:38 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 08:38 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 08:38 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:38 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 08:38 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 08:35 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 08:34 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 08:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 08:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 08:31 moritzm: installing python-apt security updates
* 08:29 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:29 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh4003.wikimedia.org
* 08:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 08:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 08:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 08:14 moritzm: installing imagemagick security updates on Bullseye
* 08:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1048.eqiad.wmnet
* 08:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1048.eqiad.wmnet
* 08:12 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.20 refs [[phab:T413811|T413811]]
* 08:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1048.eqiad.wmnet
* 07:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1048.eqiad.wmnet
* 07:17 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 07:17 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 07:16 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 07:16 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 07:14 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 07:14 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 04:53 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 00:06 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon2003.codfw.wmnet
* 00:02 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon2003.codfw.wmnet
* 00:01 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon1003.eqiad.wmnet
== 2026-03-18 ==
* 23:58 mutante: releases2003 - kill 782 (stunnel4) - systemctl start stunnel4 - fix [[phab:T420246|T420246]] [[phab:T420388|T420388]] [[phab:T420411|T420411]]
* 23:57 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon1003.eqiad.wmnet
* 23:49 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:cassandra-dev
* 23:23 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:cassandra-dev
* 23:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5017.*
* 23:02 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5020.*
* 23:01 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5028.*
* 22:40 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5028.eqsin.wmnet with OS trixie
* 22:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5020.eqsin.wmnet with OS trixie
* 22:08 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 22:04 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 21:51 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-eqiad
* 21:49 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on A:dnsbox
* 21:49 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7002.wikimedia.org
* 21:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 21:41 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5027.*
* 21:40 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 21:31 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 21:30 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5028.eqsin.wmnet with OS trixie
* 21:30 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7002.wikimedia.org
* 21:29 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5027.eqsin.wmnet with OS trixie
* 21:27 jforrester@deploy2002: mwscript-k8s job started: extensions/WikimediaMaintenance/maintenance/addWiki.php --wiki=abstractwiki # [[phab:T411723|T411723]] addWiki.php run
* 21:26 jforrester@deploy2002: mwscript-k8s job started: extensions/WikimediaMaintenance/maintenance/addWiki.php --wiki=abstractwiki # [[phab:T411723|T411723]] addWiki.php run
* 21:24 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255034{{!}}Revert "OrchestratorRequest: Switch evaluations to v2 endpoint" (T418887)]], [[gerrit:1247650{{!}}Create Abstract Wikipedia (T411725 T411726)]] (duration: 06m 44s)
* 21:20 jforrester@deploy2002: jforrester: Continuing with sync
* 21:20 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255034{{!}}Revert "OrchestratorRequest: Switch evaluations to v2 endpoint" (T418887)]], [[gerrit:1247650{{!}}Create Abstract Wikipedia (T411725 T411726)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:17 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255034{{!}}Revert "OrchestratorRequest: Switch evaluations to v2 endpoint" (T418887)]], [[gerrit:1247650{{!}}Create Abstract Wikipedia (T411725 T411726)]]
* 21:16 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5017.eqsin.wmnet with OS trixie
* 21:15 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7001.wikimedia.org
* 21:12 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5020.eqsin.wmnet with OS trixie
* 21:08 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255013{{!}}Guard for JS null deref on empty Parsoid sections (T419721)]], [[gerrit:1254889{{!}}Reapply "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]] (duration: 11m 20s)
* 21:07 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 21:07 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5028.eqsin.wmnet with OS trixie
* 21:04 jdlrobson@deploy2002: jdlrobson, harroyo-wmf: Continuing with sync
* 20:59 jdlrobson@deploy2002: jdlrobson, harroyo-wmf: Backport for [[gerrit:1255013{{!}}Guard for JS null deref on empty Parsoid sections (T419721)]], [[gerrit:1254889{{!}}Reapply "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:59 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-eqiad
* 20:58 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7001.wikimedia.org
* 20:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5027.eqsin.wmnet with reason: host reimage
* 20:57 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1255013{{!}}Guard for JS null deref on empty Parsoid sections (T419721)]], [[gerrit:1254889{{!}}Reapply "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]]
* 20:52 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-codfw
* 20:51 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5027.eqsin.wmnet with reason: host reimage
* 20:51 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in1001.wikimedia.org with reason: [[phab:T419960|T419960]]
* 20:50 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5020.eqsin.wmnet with OS trixie
* 20:50 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in2001.wikimedia.org with reason: [[phab:T419960|T419960]]
* 20:49 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 20:48 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5028.eqsin.wmnet with OS trixie
* 20:44 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5017.eqsin.wmnet with reason: host reimage
* 20:43 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns6002.wikimedia.org
* 20:42 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out1001.wikimedia.org with reason: [[phab:T419960|T419960]]
* 20:42 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1033.eqiad.wmnet with OS trixie
* 20:42 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 20:42 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 20:38 cscott@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254956{{!}}Limit legacy postprocessing cache to pages where DT does apply (T376183)]] (duration: 13m 54s)
* 20:37 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5017.eqsin.wmnet with reason: host reimage
* 20:35 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out2001.wikimedia.org with reason: [[phab:T419960|T419960]]
* 20:34 cscott@deploy2002: cscott: Continuing with sync
* 20:33 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns6002.wikimedia.org
* 20:28 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_codfw and not P<nowiki>{</nowiki>cp2042.codfw.wmnet<nowiki>}</nowiki> and A:cp
* 20:28 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2058.codfw.wmnet
* 20:28 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_codfw and not P<nowiki>{</nowiki>cp2041.codfw.wmnet<nowiki>}</nowiki> and A:cp
* 20:28 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2057.codfw.wmnet
* 20:26 cscott@deploy2002: cscott: Backport for [[gerrit:1254956{{!}}Limit legacy postprocessing cache to pages where DT does apply (T376183)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1033.eqiad.wmnet with reason: host reimage
* 20:24 cscott@deploy2002: Started scap sync-world: Backport for [[gerrit:1254956{{!}}Limit legacy postprocessing cache to pages where DT does apply (T376183)]]
* 20:24 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5027.eqsin.wmnet with OS trixie
* 20:22 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5029.*
* 20:21 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5029.eqsin.wmnet with OS trixie
* 20:20 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1033.eqiad.wmnet with reason: host reimage
* 20:18 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns6001.wikimedia.org
* 20:14 kemayo@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254965{{!}}Editcheck: fix tagging not happening for non-default checks]] (duration: 06m 28s)
* 20:10 kemayo@deploy2002: kemayo: Continuing with sync
* 20:10 kemayo@deploy2002: kemayo: Backport for [[gerrit:1254965{{!}}Editcheck: fix tagging not happening for non-default checks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1033.eqiad.wmnet with OS trixie
* 20:08 kemayo@deploy2002: Started scap sync-world: Backport for [[gerrit:1254965{{!}}Editcheck: fix tagging not happening for non-default checks]]
* 20:06 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 20:05 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns6001.wikimedia.org
* 20:05 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5017.eqsin.wmnet with OS trixie
* 20:05 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.*
* 20:05 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-codfw
* 19:51 Reedy: running `foreachwikiindblist fishbowl.dblist extensions/OATHAuth/maintenance/UpdateSecretsToEncryptedFormat.php` [[phab:T404363|T404363]]
* 19:51 Reedy: running `foreachwikiindblist private.dblist extensions/OATHAuth/maintenance/UpdateSecretsToEncryptedFormat.php` [[phab:T404363|T404363]]
* 19:50 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns5004.wikimedia.org
* 19:50 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2056.codfw.wmnet
* 19:50 Reedy: running `mwscript extensions/OATHAuth/maintenance/UpdateSecretsToEncryptedFormat.php --wiki=metawiki` [[phab:T404363|T404363]]
* 19:49 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5029.eqsin.wmnet with reason: host reimage
* 19:49 reedy@deploy2002: Synchronized private/PrivateSettings.php: Set $wgOATHSecretKey [[phab:T404363|T404363]] (duration: 05m 51s)
* 19:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2055.codfw.wmnet
* 19:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5030.eqsin.wmnet with OS trixie
* 19:42 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5029.eqsin.wmnet with reason: host reimage
* 19:39 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5017.eqsin.wmnet with OS trixie
* 19:35 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 19:35 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 19:33 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns5004.wikimedia.org
* 19:30 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host install4004.wikimedia.org with OS bookworm
* 19:29 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5020.eqsin.wmnet with OS trixie
* 19:28 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5020.eqsin.wmnet [reason: trixie reimaging]
* 19:28 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5018.eqsin.wmnet [reason: trixie reimaging]
* 19:27 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 19:27 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 19:26 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5018.eqsin.wmnet with OS trixie
* 19:23 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:23 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:18 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns5003.wikimedia.org
* 19:17 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 19:13 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 19:13 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on install4004.wikimedia.org with reason: host reimage
* 19:11 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2054.codfw.wmnet
* 19:08 brett@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts cp5029.eqsin.wmnet
* 19:08 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2053.codfw.wmnet
* 19:08 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5029.eqsin.wmnet with OS trixie
* 19:08 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on install4004.wikimedia.org with reason: host reimage
* 19:02 brett@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp5029.eqsin.wmnet
* 19:01 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns5003.wikimedia.org
* 18:56 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5031.*
* 18:56 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5031.eqsin.wmnet with OS trixie
* 18:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 18:47 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 18:46 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5030.eqsin.wmnet with OS trixie
* 18:46 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns4004.wikimedia.org
* 18:45 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host install4004.wikimedia.org with OS bookworm
* 18:45 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
* 18:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5032.eqsin.wmnet with OS trixie
* 18:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2052.codfw.wmnet
* 18:29 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2051.codfw.wmnet
* 18:27 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns4004.wikimedia.org
* 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5031.eqsin.wmnet with reason: host reimage
* 18:18 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5017.eqsin.wmnet with OS trixie
* 18:17 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5031.eqsin.wmnet with reason: host reimage
* 18:17 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5017.eqsin.wmnet [reason: trixie reimaging]
* 18:17 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5018.eqsin.wmnet with OS trixie
* 18:16 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5018.eqsin.wmnet [reason: trixie reimaging]
* 18:16 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5017.eqsin.wmnet [reason: trixie reimaging]
* 18:16 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5017.eqsin.wmnet [reason: trixie reimaging]
* 18:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3077.esams.wmnet [reason: trixie reimaging]
* 18:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3078.esams.wmnet [reason: trixie reimaging]
* 18:12 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 18:12 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns4003.wikimedia.org
* 18:09 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1258: Ready
* 18:07 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 18:01 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 17:59 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3078.esams.wmnet with OS trixie
* 17:56 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3077.esams.wmnet with OS trixie
* 17:55 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns4003.wikimedia.org
* 17:54 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2050.codfw.wmnet
* 17:51 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2049.codfw.wmnet
* 17:43 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:42 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:40 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns3004.wikimedia.org
* 17:39 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:38 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backupmon1001.eqiad.wmnet with reason: upgrade
* 17:35 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1347.eqiad.wmnet with OS trixie
* 17:33 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3078.esams.wmnet with reason: host reimage
* 17:32 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 17:32 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5031.eqsin.wmnet with OS trixie
* 17:32 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5032.eqsin.wmnet with OS trixie
* 17:32 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5031.eqsin.wmnet with OS trixie
* 17:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 17:30 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 17:30 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3077.esams.wmnet with reason: host reimage
* 17:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:29 claime: rearmed keyholder on deploy1003
* 17:28 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:27 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3078.esams.wmnet with reason: host reimage
* 17:26 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3077.esams.wmnet with reason: host reimage
* 17:25 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:25 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1258: Ready
* 17:23 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns3004.wikimedia.org
* 17:21 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 17:21 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 17:21 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:20 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:20 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-esams and A:ncredir
* 17:19 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1347.eqiad.wmnet with reason: host reimage
* 17:18 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-drmrs and A:ncredir
* 17:16 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-eqiad and A:ncredir
* 17:15 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-ulsfo and A:ncredir
* 17:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2048.codfw.wmnet
* 17:15 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 17:14 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1347.eqiad.wmnet with reason: host reimage
* 17:14 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5031.eqsin.wmnet with OS trixie
* 17:12 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 17:12 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2047.codfw.wmnet
* 17:11 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 17:09 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 17:09 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp3078.*
* 17:08 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 17:08 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 17:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3079.*
* 17:08 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns3003.wikimedia.org
* 17:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3078.*
* 17:07 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-eqiad and A:ncredir
* 17:07 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
* 17:07 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-esams and A:ncredir
* 17:06 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
* 17:06 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir2002.*
* 17:05 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-drmrs and A:ncredir
* 17:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir2002.codfw.wmnet
* 17:05 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-eqsin and A:ncredir
* 17:05 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-ulsfo and A:ncredir
* 17:04 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-magru and A:ncredir
* 17:03 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3078.esams.wmnet with OS trixie
* 17:02 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1347
* 17:02 jayme@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1347
* 17:02 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3078.esams.wmnet [reason: trixie reimaging]
* 17:01 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3076.esams.wmnet [reason: trixie reimaging]
* 17:01 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3077.esams.wmnet with OS trixie
* 17:01 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3077.esams.wmnet [reason: trixie reimaging]
* 16:59 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncredir2002.codfw.wmnet
* 16:58 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir2002.*
* 16:56 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 8 hosts with reason: upgrade
* 16:55 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir2001.*
* 16:55 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ncredir2001.codfw.wmnet
* 16:55 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for ncredir2001.codfw.wmnet
* 16:55 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3076.esams.wmnet with OS trixie
* 16:53 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2014.codfw.wmnet
* 16:52 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-eqsin and A:ncredir
* 16:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2008.codfw.wmnet with reason: kernel update
* 16:51 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-magru and A:ncredir
* 16:51 klausman@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on ml-serve1013.eqiad.wmnet with reason: Reboot for security update
* 16:50 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2013.codfw.wmnet
* 16:49 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir2001.*
* 16:49 brett@cumin2002: END (ERROR) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=97) rolling reboot on A:ncredir and A:ncredir
* 16:48 jayme@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1347
* 16:48 jayme@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1347.eqiad.wmnet 199.48.64.10.in-addr.arpa 9.9.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 16:48 jayme@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1347.eqiad.wmnet 199.48.64.10.in-addr.arpa 9.9.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 16:48 jayme@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:47 jayme@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1347 - jayme@cumin1003"
* 16:47 jayme@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1347 - jayme@cumin1003"
* 16:47 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns3003.wikimedia.org
* 16:47 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1012.eqiad.wmnet
* 16:47 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir and A:ncredir
* 16:47 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2012.codfw.wmnet
* 16:47 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2014.codfw.wmnet
* 16:46 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3075.esams.wmnet [reason: trixie reimaging]
* 16:46 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2003.codfw.wmnet
* 16:45 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet
* 16:44 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2013.codfw.wmnet
* 16:44 jayme@cumin1003: START - Cookbook sre.dns.netbox
* 16:43 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2009.codfw.wmnet
* 16:43 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1347
* 16:43 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99)
* 16:43 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1347.eqiad.wmnet with OS trixie
* 16:43 brett@cumin2002: START - Cookbook sre.hosts.reboot-cluster
* 16:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2007.codfw.wmnet with reason: kernel update
* 16:40 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2012.codfw.wmnet
* 16:39 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3079.esams.wmnet with OS trixie
* 16:39 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2008.codfw.wmnet
* 16:38 moritzm: installing PHP 8.2 security updates
* 16:37 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2009.codfw.wmnet
* 16:36 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2046.codfw.wmnet
* 16:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3078.esams.wmnet with OS trixie
* 16:34 moritzm: installing alsa-lib security updates
* 16:33 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2045.codfw.wmnet
* 16:32 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2008.codfw.wmnet
* 16:32 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns2006.wikimedia.org
* 16:29 moritzm: failover Ganeti master in eqiad to ganeti1046
* 16:29 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3076.esams.wmnet with reason: host reimage
* 16:29 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2003.codfw.wmnet
* 16:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1054.eqiad.wmnet
* 16:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1054.eqiad.wmnet
* 16:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2005.codfw.wmnet with reason: kernel update
* 16:24 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3076.esams.wmnet with reason: host reimage
* 16:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1054.eqiad.wmnet
* 16:22 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1012.eqiad.wmnet
* 16:20 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1013.eqiad.wmnet
* 16:19 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1011.eqiad.wmnet
* 16:18 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1054.eqiad.wmnet
* 16:18 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 16:16 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host install4004.wikimedia.org with OS bookworm
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1053.eqiad.wmnet
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1053.eqiad.wmnet
* 16:14 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns2006.wikimedia.org
* 16:14 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1013.eqiad.wmnet
* 16:14 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1009.eqiad.wmnet
* 16:13 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3079.esams.wmnet with reason: host reimage
* 16:13 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1011.eqiad.wmnet
* 16:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1029.eqiad.wmnet with reason: kernel update
* 16:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1016.eqiad.wmnet with OS bookworm
* 16:12 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 16:11 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 16:11 moritzm: powercycling ganeti1053 (stuck on reboot)
* 16:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3078.esams.wmnet with reason: host reimage
* 16:09 klausman@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:ml-serve-worker-eqiad
* 16:09 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 16:08 klausman@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:ml-serve-worker-eqiad
* 16:07 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1009.eqiad.wmnet
* 16:07 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1003.eqiad.wmnet
* 16:06 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3079.esams.wmnet with reason: host reimage
* 16:06 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 16:06 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3078.esams.wmnet with reason: host reimage
* 16:04 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 16:04 klausman@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:ml-serve-worker-eqiad
* 16:04 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 16:02 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 16:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1028.eqiad.wmnet with reason: kernel update
* 16:00 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1003.eqiad.wmnet
* 16:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1015.eqiad.wmnet
* 16:00 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3075.esams.wmnet with OS trixie
* 16:00 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3076.esams.wmnet with OS trixie
* 15:59 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns2005.wikimedia.org
* 15:58 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3076.esams.wmnet [reason: trixie reimaging]
* 15:58 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1012.eqiad.wmnet
* 15:58 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2044.codfw.wmnet
* 15:58 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3078.esams.wmnet [reason: trixie reimaging]
* 15:57 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3078.esams.wmnet [reason: trixie reimaging]
* 15:57 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1010.eqiad.wmnet
* 15:57 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1008.eqiad.wmnet
* 15:57 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3074.esams.wmnet [reason: trixie reimaging]
* 15:56 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 15:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1053.eqiad.wmnet
* 15:55 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2043.codfw.wmnet
* 15:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1017.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1023.eqiad.wmnet with reason: kernel update
* 15:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1015.eqiad.wmnet
* 15:54 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy1022.eqiad.wmnet
* 15:54 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1008.eqiad.wmnet
* 15:54 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy1022.eqiad.wmnet
* 15:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1053.eqiad.wmnet
* 15:52 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1052.eqiad.wmnet
* 15:52 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1010.eqiad.wmnet
* 15:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1052.eqiad.wmnet
* 15:51 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1012.eqiad.wmnet
* 15:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3074.esams.wmnet with OS trixie
* 15:49 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_codfw and not P<nowiki>{</nowiki>cp2042.codfw.wmnet<nowiki>}</nowiki> and A:cp
* 15:48 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1014.eqiad.wmnet
* 15:48 klausman@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on A:ml-serve-worker-eqiad
* 15:47 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy1003.eqiad.wmnet
* 15:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1052.eqiad.wmnet
* 15:46 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_codfw and not P<nowiki>{</nowiki>cp2041.codfw.wmnet<nowiki>}</nowiki> and A:cp
* 15:45 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1017.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:45 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns2005.wikimedia.org
* 15:42 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1014.eqiad.wmnet
* 15:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1052.eqiad.wmnet
* 15:41 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3079.esams.wmnet with OS trixie
* 15:41 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3078.esams.wmnet with OS trixie
* 15:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
* 15:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1051.eqiad.wmnet
* 15:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1022.eqiad.wmnet with reason: kernel update
* 15:38 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Update for dse-k8s-worker1016 - btullis@cumin1003"
* 15:37 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Update for dse-k8s-worker1016 - btullis@cumin1003"
* 15:37 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy1003.eqiad.wmnet
* 15:37 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1027.eqiad.wmnet
* 15:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1016.eqiad.wmnet with reason: host reimage
* 15:35 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-eqiad
* 15:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1051.eqiad.wmnet
* 15:34 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3075.esams.wmnet with reason: host reimage
* 15:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1372.eqiad.wmnet
* 15:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1371.eqiad.wmnet
* 15:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1370.eqiad.wmnet
* 15:31 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1027.eqiad.wmnet
* 15:30 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns2004.wikimedia.org
* 15:29 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1016.eqiad.wmnet with reason: host reimage
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1369.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1368.eqiad.wmnet
* 15:28 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1372.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1367.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1366.eqiad.wmnet
* 15:28 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1371.eqiad.wmnet
* 15:28 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1370.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1365.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1364.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1363.eqiad.wmnet
* 15:27 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1362.eqiad.wmnet
* 15:27 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1361.eqiad.wmnet
* 15:27 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1017
* 15:27 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1360.eqiad.wmnet
* 15:26 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1017
* 15:25 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3074.esams.wmnet with reason: host reimage
* 15:25 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install4004.wikimedia.org with OS bookworm
* 15:25 sukhe@dns1004: END - running authdns-update
* 15:24 sukhe@dns1004: START - running authdns-update
* 15:24 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install4004.wikimedia.org
* 15:24 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host install4004.wikimedia.org with OS bookworm
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1369.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1368.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1367.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1366.eqiad.wmnet
* 15:23 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1365.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1364.eqiad.wmnet
* 15:22 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1363.eqiad.wmnet
* 15:22 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1362.eqiad.wmnet
* 15:22 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1361.eqiad.wmnet
* 15:22 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1360.eqiad.wmnet
* 15:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1349.eqiad.wmnet
* 15:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1050.eqiad.wmnet
* 15:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1050.eqiad.wmnet
* 15:18 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3075.esams.wmnet with reason: host reimage
* 15:18 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3074.esams.wmnet with reason: host reimage
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1348.eqiad.wmnet
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1346.eqiad.wmnet
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1344.eqiad.wmnet
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1345.eqiad.wmnet
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1343.eqiad.wmnet
* 15:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1342.eqiad.wmnet
* 15:16 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns2004.wikimedia.org
* 15:15 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1349.eqiad.wmnet
* 15:15 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1341.eqiad.wmnet
* 15:15 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1340.eqiad.wmnet
* 15:15 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1339.eqiad.wmnet
* 15:15 moritzm: imported jenkins 2.541.3 for bullseye/bookworm/trixie
* 15:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1338.eqiad.wmnet
* 15:14 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1016.eqiad.wmnet with OS bookworm
* 15:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1050.eqiad.wmnet
* 15:12 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1348.eqiad.wmnet
* 15:12 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1346.eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1336.eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1337.eqiad.wmnet
* 15:12 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1345.eqiad.wmnet
* 15:12 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1344.eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1334.eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1335.eqiad.wmnet
* 15:11 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1343.eqiad.wmnet
* 15:11 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1342.eqiad.wmnet
* 15:11 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1332.eqiad.wmnet
* 15:11 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1333.eqiad.wmnet
* 15:11 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Post reimage - btullis@cumin1003"
* 15:11 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Post reimage - btullis@cumin1003"
* 15:10 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1341.eqiad.wmnet
* 15:10 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1340.eqiad.wmnet
* 15:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1331.eqiad.wmnet
* 15:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1330.eqiad.wmnet
* 15:10 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1339.eqiad.wmnet
* 15:09 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1329.eqiad.wmnet
* 15:09 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1338.eqiad.wmnet
* 15:09 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1328.eqiad.wmnet
* 15:09 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1050.eqiad.wmnet
* 15:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1049.eqiad.wmnet
* 15:07 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1337.eqiad.wmnet
* 15:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1049.eqiad.wmnet
* 15:07 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1336.eqiad.wmnet
* 15:07 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1335.eqiad.wmnet
* 15:06 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1334.eqiad.wmnet
* 15:06 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1333.eqiad.wmnet
* 15:06 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1332.eqiad.wmnet
* 15:05 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1331.eqiad.wmnet
* 15:05 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1330.eqiad.wmnet
* 15:04 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1329.eqiad.wmnet
* 15:04 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1328.eqiad.wmnet
* 15:03 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-eqiad
* 15:02 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1033.eqiad.wmnet with OS trixie
* 15:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1049.eqiad.wmnet
* 15:01 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns1006.wikimedia.org
* 14:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum4002.ulsfo.wmnet
* 14:58 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:58 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:57 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
* 14:54 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3075.esams.wmnet with OS trixie
* 14:54 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3075.esams.wmnet [reason: trixie reimaging]
* 14:54 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3074.esams.wmnet with OS trixie
* 14:53 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3074.esams.wmnet [reason: trixie reimaging]
* 14:52 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1047.eqiad.wmnet
* 14:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1047.eqiad.wmnet
* 14:50 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:49 slyngshede@dns1004: END - running authdns-update
* 14:48 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:48 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:48 slyngshede@dns1004: START - running authdns-update
* 14:47 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns1006.wikimedia.org
* 14:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1047.eqiad.wmnet
* 14:46 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum4002.ulsfo.wmnet
* 14:45 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-eqiad
* 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum4001.ulsfo.wmnet
* 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum4001.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:44 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum4001.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1047.eqiad.wmnet
* 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1046.eqiad.wmnet
* 14:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1046.eqiad.wmnet
* 14:40 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:38 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1046.eqiad.wmnet
* 14:36 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum4001.ulsfo.wmnet
* 14:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install4004.wikimedia.org with OS bookworm
* 14:33 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1046.eqiad.wmnet
* 14:32 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns1005.wikimedia.org
* 14:32 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "inline pattern and pattern equivalence - oblivian@cumin1003"
* 14:32 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: inline pattern and pattern equivalence - oblivian@cumin1003
* 14:31 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: inline pattern and pattern equivalence - oblivian@cumin1003
* 14:31 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "inline pattern and pattern equivalence - oblivian@cumin1003"
* 14:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1045.eqiad.wmnet
* 14:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1045.eqiad.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install4004.wikimedia.org - jmm@cumin2002"
* 14:25 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install4004.wikimedia.org - jmm@cumin2002"
* 14:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1045.eqiad.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install4004.wikimedia.org on all recursors
* 14:24 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install4004.wikimedia.org on all recursors
* 14:21 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) install4004.wikimedia.org on all recursors
* 14:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install4004.wikimedia.org on all recursors
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install4004.wikimedia.org - jmm@cumin2002"
* 14:20 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install4004.wikimedia.org - jmm@cumin2002"
* 14:19 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns1005.wikimedia.org
* 14:17 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254911{{!}}Restore quotation-marks in ext.wikilambda.app messages (T420456)]] (duration: 06m 32s)
* 14:17 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 14:17 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS bookworm
* 14:17 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:16 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:16 apine@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install4004.wikimedia.org
* 14:15 apine@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:15 apine@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:14 apine@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:14 apine@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:14 jforrester@deploy2002: jforrester: Continuing with sync
* 14:13 apine@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:13 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:13 jforrester@deploy2002: jforrester: Backport for [[gerrit:1254911{{!}}Restore quotation-marks in ext.wikilambda.app messages (T420456)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:13 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:11 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1254911{{!}}Restore quotation-marks in ext.wikilambda.app messages (T420456)]]
* 14:08 apine@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 apine@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 apine@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:06 apine@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:05 XioNoX: set graceful-shutdown on EdgeUno transit sessions
* 14:05 apine@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:04 apine@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:04 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns1004.wikimedia.org
* 14:02 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1045.eqiad.wmnet
* 14:01 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 14:01 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:59 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 13:57 Msz2001: UTC afternoon backport+config window done
* 13:56 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254876{{!}}Tweak configuration of external link aggregate usage analysis (T419837)]] (duration: 06m 41s)
* 13:55 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1033.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:55 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1033.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:53 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 13:52 mszwarc@deploy2002: mszwarc: Continuing with sync
* 13:51 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1254876{{!}}Tweak configuration of external link aggregate usage analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:50 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns1004.wikimedia.org
* 13:50 sukhe@cumin1003: START - Cookbook sre.dns.roll-reboot rolling reboot on A:dnsbox
* 13:49 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1254876{{!}}Tweak configuration of external link aggregate usage analysis (T419837)]]
* 13:49 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254916{{!}}Normalize external domain names in click analysis (T419837)]], [[gerrit:1254917{{!}}Normalize external domain names in click analysis (T419837)]] (duration: 07m 23s)
* 13:46 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet
* 13:45 mszwarc@deploy2002: mszwarc: Continuing with sync
* 13:43 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1254916{{!}}Normalize external domain names in click analysis (T419837)]], [[gerrit:1254917{{!}}Normalize external domain names in click analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2096.codfw.wmnet
* 13:41 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1033.eqiad.wmnet with OS trixie
* 13:41 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS bookworm
* 13:41 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1254916{{!}}Normalize external domain names in click analysis (T419837)]], [[gerrit:1254917{{!}}Normalize external domain names in click analysis (T419837)]]
* 13:40 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248095{{!}}filebackend: Remove outdated comment]], [[gerrit:1254216{{!}}GrowthExperiments: increase edit and thanks query limit II (T341599)]] (duration: 08m 47s)
* 13:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1044.eqiad.wmnet
* 13:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1044.eqiad.wmnet
* 13:39 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet
* 13:39 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1096.eqiad.wmnet
* 13:37 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2096.codfw.wmnet
* 13:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:36 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2095.codfw.wmnet
* 13:36 sgimeno@deploy2002: matmarex, sgimeno: Continuing with sync
* 13:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1044.eqiad.wmnet
* 13:33 sgimeno@deploy2002: matmarex, sgimeno: Backport for [[gerrit:1248095{{!}}filebackend: Remove outdated comment]], [[gerrit:1254216{{!}}GrowthExperiments: increase edit and thanks query limit II (T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:31 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1096.eqiad.wmnet
* 13:31 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1095.eqiad.wmnet
* 13:31 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1248095{{!}}filebackend: Remove outdated comment]], [[gerrit:1254216{{!}}GrowthExperiments: increase edit and thanks query limit II (T341599)]]
* 13:30 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2095.codfw.wmnet
* 13:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2094.codfw.wmnet
* {{safesubst:SAL entry|1=13:28 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254894{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254895{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254891{{!}}Revert "SpecialPreferences: Use Language Select Widget in language field" (T419895)]], [[gerrit:1254890{{!}}Revert "SpecialPreferences: Use Language Select Widget in lan}}
* 13:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 13:28 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 13:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1026.eqiad.wmnet
* 13:27 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1044.eqiad.wmnet
* 13:26 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 13:25 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1095.eqiad.wmnet
* 13:25 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1094.eqiad.wmnet
* 13:24 sgimeno@deploy2002: somerandomdeveloper, sgimeno: Continuing with sync
* {{safesubst:SAL entry|1=13:24 sgimeno@deploy2002: somerandomdeveloper, sgimeno: Backport for [[gerrit:1254894{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254895{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254891{{!}}Revert "SpecialPreferences: Use Language Select Widget in language field" (T419895)]], [[gerrit:1254890{{!}}Revert "SpecialPreferences: Use Language Select Widget in}}
* 13:23 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:23 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2094.codfw.wmnet
* 13:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2093.codfw.wmnet
* {{safesubst:SAL entry|1=13:22 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1254894{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254895{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254891{{!}}Revert "SpecialPreferences: Use Language Select Widget in language field" (T419895)]], [[gerrit:1254890{{!}}Revert "SpecialPreferences: Use Language Select Widget in lang}}
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1043.eqiad.wmnet
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1043.eqiad.wmnet
* 13:21 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1026.eqiad.wmnet
* 13:20 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:16 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2093.codfw.wmnet
* 13:16 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2092.codfw.wmnet
* 13:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1043.eqiad.wmnet
* 13:16 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1094.eqiad.wmnet
* 13:16 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1093.eqiad.wmnet
* 13:15 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:15 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 13:15 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 13:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1043.eqiad.wmnet
* 13:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1042.eqiad.wmnet
* 13:10 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:10 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 13:09 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1093.eqiad.wmnet
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1042.eqiad.wmnet
* 13:09 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2092.codfw.wmnet
* 13:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1027.eqiad.wmnet with reason: host reimage
* 13:08 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1016
* 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 13:06 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1016
* 13:06 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 13:04 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1027.eqiad.wmnet with reason: host reimage
* 13:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1042.eqiad.wmnet
* 13:02 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 13:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1012.eqiad.wmnet
* 12:58 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1020.eqiad.wmnet with OS bookworm
* 12:58 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 12:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1042.eqiad.wmnet
* 12:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1015.eqiad.wmnet
* 12:55 ayounsi@dns1004: END - running authdns-update
* 12:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1012.eqiad.wmnet
* 12:54 ayounsi@dns1004: START - running authdns-update
* 12:53 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1012.eqiad.wmnet
* 12:53 ayounsi@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 12:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1015.eqiad.wmnet
* 12:50 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 12:50 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1041.eqiad.wmnet
* 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1041.eqiad.wmnet
* 12:47 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-codfw
* 12:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1041.eqiad.wmnet
* 12:42 btullis@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-jumbo-eqiad
* 12:38 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:37 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:37 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:36 ayounsi@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 12:35 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1372].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 12:33 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:32 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:31 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:30 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1020.eqiad.wmnet with reason: host reimage
* 12:27 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1041.eqiad.wmnet
* 12:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1040.eqiad.wmnet
* 12:25 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1020.eqiad.wmnet with reason: host reimage
* 12:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1040.eqiad.wmnet
* 12:25 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 12:25 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1026.eqiad.wmnet with OS bookworm
* 12:25 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:24 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Update for dse-k8s-worker1015 - btullis@cumin1003"
* 12:24 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Update for dse-k8s-worker1015 - btullis@cumin1003"
* 12:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 12:22 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:21 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1040.eqiad.wmnet
* 12:19 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:19 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host serpens.wikimedia.org
* 12:13 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254851{{!}}Enable autodemotion for 2FA-less CN admins and WMF T&S (T418580)]] (duration: 06m 21s)
* 12:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host serpens.wikimedia.org
* 12:10 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1040.eqiad.wmnet
* 12:10 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-eqiad
* 12:09 mszwarc@deploy2002: mszwarc: Continuing with sync
* 12:09 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1254851{{!}}Enable autodemotion for 2FA-less CN admins and WMF T&S (T418580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:07 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1254851{{!}}Enable autodemotion for 2FA-less CN admins and WMF T&S (T418580)]]
* 12:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1039.eqiad.wmnet
* 12:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1039.eqiad.wmnet
* 12:05 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:05 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-eqiad
* 12:04 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:03 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:02 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254883{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254884{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]], [[gerrit:1254882{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254881{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]] (duration: 06m 48s)
* 12:02 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1026.eqiad.wmnet with reason: host reimage
* 12:02 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-eqiad
* 12:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1039.eqiad.wmnet
* 12:01 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1039.eqiad.wmnet
* 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet
* 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
* 11:59 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1026.eqiad.wmnet with reason: host reimage
* 11:58 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 11:58 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1012.eqiad.wmnet
* 11:57 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1254883{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254884{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]], [[gerrit:1254882{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254881{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]] synced to the testservers (see https://wikitech.wikimedia.
* 11:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 11:56 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1372].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 11:56 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-codfw
* 11:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1027.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 11:55 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1254883{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254884{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]], [[gerrit:1254882{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254881{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]]
* 11:54 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1027.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 11:54 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 11:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
* 11:50 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:50 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Updating for dse-k8s-worker1012 - btullis@cumin1003"
* 11:49 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Updating for dse-k8s-worker1012 - btullis@cumin1003"
* 11:49 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1015.eqiad.wmnet with reason: host reimage
* 11:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1037.eqiad.wmnet
* 11:48 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1037.eqiad.wmnet
* 11:48 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1307.eqiad.wmnet
* 11:48 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1307.eqiad.wmnet
* 11:47 claime: sudo homer lsw1-e5-eqiad* commit 'wikikube-worker1307 to active'
* 11:47 cgoubert@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:46 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1015.eqiad.wmnet with reason: host reimage
* 11:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 11:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1026.eqiad.wmnet with OS bookworm
* 11:44 cgoubert@cumin1003: START - Cookbook sre.dns.netbox
* 11:42 cgoubert@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1026.eqiad.wmnet with OS bookworm
* 11:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet
* 11:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
* 11:39 cgoubert@cumin1003: START - Cookbook sre.dns.netbox
* 11:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1037.eqiad.wmnet
* 11:37 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2091.codfw.wmnet
* 11:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet
* 11:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1036.eqiad.wmnet
* 11:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet
* 11:36 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1347.eqiad.wmnet
* 11:34 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1020.eqiad.wmnet with OS bookworm
* 11:31 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 11:31 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2091.codfw.wmnet
* 11:30 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1092.eqiad.wmnet
* 11:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2090.codfw.wmnet
* 11:30 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1347.eqiad.wmnet
* 11:30 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet
* 11:30 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 11:30 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 11:29 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 11:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet
* 11:29 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 11:28 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 11:28 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 11:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet
* 11:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet
* 11:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2090.codfw.wmnet
* 11:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2089.codfw.wmnet
* 11:23 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1092.eqiad.wmnet
* 11:23 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1091.eqiad.wmnet
* 11:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1035.eqiad.wmnet
* 11:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet
* 11:20 btullis@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host dse-k8s-worker1015
* 11:20 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1015
* 11:18 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1091.eqiad.wmnet
* 11:18 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1090.eqiad.wmnet
* 11:18 vgutierrez@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: no reason specified, no task ID specified]
* 11:18 vgutierrez@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: no reason specified, no task ID specified]
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping2004.codfw.wmnet
* 11:17 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
* 11:17 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2088.codfw.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet
* 11:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping2004.codfw.wmnet
* 11:13 vgutierrez@dns1004: END - running authdns-update
* 11:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1035.eqiad.wmnet
* 11:11 vgutierrez@dns1004: START - running authdns-update
* 11:11 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1090.eqiad.wmnet
* 11:11 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1089.eqiad.wmnet
* 11:10 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2088.codfw.wmnet
* 11:10 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2087.codfw.wmnet
* 11:08 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1034.eqiad.wmnet
* 11:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1034.eqiad.wmnet
* 11:07 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1026.eqiad.wmnet with OS bookworm
* 11:05 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1026.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 11:05 btullis@cumin1003: END (PASS) - Cookbook sre.hadoop.reboot-workers (exit_code=0) for Hadoop test cluster
* 11:04 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1089.eqiad.wmnet
* 11:04 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1026.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 11:04 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1088.eqiad.wmnet
* 11:03 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2087.codfw.wmnet
* 11:03 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2086.codfw.wmnet
* 11:03 vgutierrez@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1034.eqiad.wmnet
* 11:00 vgutierrez@cumin1003: START - Cookbook sre.dns.netbox
* 10:59 btullis@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-jumbo-eqiad
* 10:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1034.eqiad.wmnet
* 10:57 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1088.eqiad.wmnet
* 10:57 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1087.eqiad.wmnet
* 10:57 fabfur@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1033.eqiad.wmnet
* 10:56 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2086.codfw.wmnet
* 10:56 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2085.codfw.wmnet
* 10:56 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 10:53 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1087.eqiad.wmnet
* 10:53 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1086.eqiad.wmnet
* 10:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet
* 10:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
* 10:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1032.eqiad.wmnet
* 10:49 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2085.codfw.wmnet
* 10:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet
* 10:49 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2084.codfw.wmnet
* 10:46 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1086.eqiad.wmnet
* 10:46 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1085.eqiad.wmnet
* 10:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet
* 10:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1032.eqiad.wmnet
* 10:40 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2084.codfw.wmnet
* 10:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1031.eqiad.wmnet
* 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2083.codfw.wmnet
* 10:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1031.eqiad.wmnet
* 10:39 fabfur@cumin1003: START - Cookbook sre.dns.netbox
* 10:39 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1085.eqiad.wmnet
* 10:39 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1084.eqiad.wmnet
* 10:37 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2001.codfw.wmnet
* 10:34 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:34 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1031.eqiad.wmnet
* 10:32 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1084.eqiad.wmnet
* 10:32 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1083.eqiad.wmnet
* 10:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2083.codfw.wmnet
* 10:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1031.eqiad.wmnet
* 10:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2082.codfw.wmnet
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1030.eqiad.wmnet
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1030.eqiad.wmnet
* 10:32 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-staging2001.codfw.wmnet
* 10:31 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2002.codfw.wmnet
* 10:26 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-staging2002.codfw.wmnet
* 10:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1030.eqiad.wmnet
* 10:25 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1083.eqiad.wmnet
* 10:25 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1082.eqiad.wmnet
* 10:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2082.codfw.wmnet
* 10:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2081.codfw.wmnet
* 10:24 btullis@cumin1003: START - Cookbook sre.hadoop.reboot-workers for Hadoop test cluster
* 10:23 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2003.codfw.wmnet
* 10:23 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1030.eqiad.wmnet
* 10:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1029.eqiad.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet
* 10:19 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-staging2003.codfw.wmnet
* 10:18 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1082.eqiad.wmnet
* 10:17 vgutierrez@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: no reason specified, no task ID specified]
* 10:17 vgutierrez@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: no reason specified, no task ID specified]
* 10:17 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2081.codfw.wmnet
* 10:17 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2001.codfw.wmnet
* 10:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet
* 10:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1029.eqiad.wmnet
* 10:14 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2001.codfw.wmnet
* 10:14 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2002.codfw.wmnet
* 10:13 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1081.eqiad.wmnet
* 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1028.eqiad.wmnet
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1028.eqiad.wmnet
* 10:11 vgutierrez@dns1004: END - running authdns-update
* 10:10 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2002.codfw.wmnet
* 10:10 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2001.codfw.wmnet
* 10:09 vgutierrez@dns1004: START - running authdns-update
* 10:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2080.codfw.wmnet
* 10:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1028.eqiad.wmnet
* 10:06 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1081.eqiad.wmnet
* 10:06 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1080.eqiad.wmnet
* 10:05 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2001.codfw.wmnet
* 10:05 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2002.codfw.wmnet
* 10:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: no reason specified, [[phab:T418971|T418971]]]
* 10:04 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: no reason specified, [[phab:T418971|T418971]]]
* 10:03 slyngshede@cumin1003: END (FAIL) - Cookbook sre.dns.admin (exit_code=99) DNS admin: pool ulsfo [reason: no reason specified, no task ID specified]
* 10:03 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: no reason specified, no task ID specified]
* 10:01 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2002.codfw.wmnet
* 10:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 23 hosts
* 10:01 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1028.eqiad.wmnet
* 10:01 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1002.eqiad.wmnet
* 10:01 slyngshede@cumin1003: START - Cookbook sre.hosts.remove-downtime for 23 hosts
* 09:59 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1080.eqiad.wmnet
* 09:59 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1079.eqiad.wmnet
* 09:59 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2080.codfw.wmnet
* 09:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2079.codfw.wmnet
* 09:58 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1002.eqiad.wmnet
* 09:57 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1001.eqiad.wmnet
* 09:56 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1027.eqiad.wmnet
* 09:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet
* 09:52 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1001.eqiad.wmnet
* 09:51 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2079.codfw.wmnet
* 09:51 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2078.codfw.wmnet
* 09:51 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1079.eqiad.wmnet
* 09:51 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1003.eqiad.wmnet
* 09:51 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1078.eqiad.wmnet
* 09:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1003.eqiad.wmnet
* 09:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet
* 09:48 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1003.eqiad.wmnet
* 09:48 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1002.eqiad.wmnet
* 09:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1003.eqiad.wmnet
* 09:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1027.eqiad.wmnet
* 09:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 09:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1026.eqiad.wmnet
* 09:46 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1002.eqiad.wmnet
* 09:46 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1001.eqiad.wmnet
* 09:45 moritzm: installing postgresql-15 security updates
* 09:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart A:lvs-secondary-ulsfo and A:liberica
* 09:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) pooling P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2078.codfw.wmnet
* 09:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2077.codfw.wmnet
* 09:45 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.admin pooling P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) depooling P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:44 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.admin depooling P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2003.codfw.wmnet
* 09:44 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.upgrade restart A:lvs-secondary-ulsfo and A:liberica
* 09:44 jayme: switched wikikube staging apiservers to IPIP and maglev in eqiad and codfw - [[phab:T352956|T352956]]
* 09:43 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1078.eqiad.wmnet
* 09:43 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1077.eqiad.wmnet
* 09:42 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1001.eqiad.wmnet
* 09:40 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: wikikube-staging-master-eqiad@eqiad
* 09:40 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:40 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading A:lvs-secondary-ulsfo and A:liberica ([[phab:T418971|T418971]])
* 09:40 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading A:lvs-secondary-ulsfo and A:liberica ([[phab:T418971|T418971]])
* 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2003.codfw.wmnet
* 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1026.eqiad.wmnet
* 09:39 jayme@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:38 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2077.codfw.wmnet
* 09:37 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2076.codfw.wmnet
* 09:37 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-codfw
* 09:37 jayme@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: wikikube-staging-master-eqiad@eqiad
* 09:36 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1077.eqiad.wmnet
* 09:36 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1076.eqiad.wmnet
* 09:35 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: wikikube-staging-master-codfw@codfw
* 09:35 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:35 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 09:30 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2076.codfw.wmnet
* 09:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2075.codfw.wmnet
* 09:26 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1076.eqiad.wmnet
* 09:26 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1075.eqiad.wmnet
* 09:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 09:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1025.eqiad.wmnet
* 09:23 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2075.codfw.wmnet
* 09:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2074.codfw.wmnet
* 09:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 09:19 jayme@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:18 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1075.eqiad.wmnet
* 09:18 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1074.eqiad.wmnet
* 09:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 09:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1025.eqiad.wmnet
* 09:15 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2074.codfw.wmnet
* 09:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2073.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: wikikube-staging-master-codfw@codfw
* 09:14 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 09:13 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-codfw
* 09:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 09:12 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-eqiad
* 09:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1024.eqiad.wmnet
* 09:10 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1074.eqiad.wmnet
* 09:10 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1073.eqiad.wmnet
* 09:08 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2073.codfw.wmnet
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2072.codfw.wmnet
* 09:08 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 23 hosts with reason: Update ULSFO LVS service IPs
* 09:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1024.eqiad.wmnet
* 09:03 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1073.eqiad.wmnet
* 09:03 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1072.eqiad.wmnet
* 09:02 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2072.codfw.wmnet
* 09:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2071.codfw.wmnet
* 09:02 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: no reason specified, [[phab:T418971|T418971]]]
* 09:02 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: no reason specified, [[phab:T418971|T418971]]]
* 09:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping1004.eqiad.wmnet
* 09:00 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 08:58 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1023.eqiad.wmnet
* 08:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1023.eqiad.wmnet
* 08:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping1004.eqiad.wmnet
* 08:56 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1072.eqiad.wmnet
* 08:56 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1071.eqiad.wmnet
* 08:55 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2071.codfw.wmnet
* 08:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2070.codfw.wmnet
* 08:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1023.eqiad.wmnet
* 08:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1023.eqiad.wmnet
* 08:48 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1071.eqiad.wmnet
* 08:47 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2070.codfw.wmnet
* 08:46 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-eqiad
* 08:29 hashar: Restarting CI Jenkins for plugin upgrade # [[phab:T420347|T420347]]
* 08:22 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.20 refs [[phab:T413811|T413811]]
* 07:45 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 32934
* 07:42 btullis@cumin1003: END (PASS) - Cookbook sre.hadoop.reboot-workers (exit_code=0) for Hadoop analytics cluster
* 07:35 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 32934
* 07:22 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
* 07:16 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
* 06:54 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
* 06:38 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
* 03:22 musikanimal@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254468{{!}}CM5: add more aggressive warnings about CM5 deprecation (T373720)]] (duration: 12m 22s)
* 03:18 musikanimal@deploy2002: musikanimal: Continuing with sync
* 03:11 musikanimal@deploy2002: musikanimal: Backport for [[gerrit:1254468{{!}}CM5: add more aggressive warnings about CM5 deprecation (T373720)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 03:09 musikanimal@deploy2002: Started scap sync-world: Backport for [[gerrit:1254468{{!}}CM5: add more aggressive warnings about CM5 deprecation (T373720)]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 47s)
* 02:07 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 02:06 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 02:05 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 02:05 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 02:05 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 02:05 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 02:04 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:38 denisse@deploy2002: Finished deploy [librenms/librenms@9bdfb73]: Upgrade LibreNMS to 26.3.1 (duration: 00m 19s)
* 01:38 denisse@deploy2002: Started deploy [librenms/librenms@9bdfb73]: Upgrade LibreNMS to 26.3.1
* 01:10 denisse@deploy2002: Finished deploy [librenms/librenms@d152b36]: Upgrade LibreNMS to 25.11.0 (duration: 00m 08s)
* 01:10 denisse@deploy2002: Started deploy [librenms/librenms@d152b36]: Upgrade LibreNMS to 25.11.0
== 2026-03-17 ==
* 23:44 btullis@cumin1003: START - Cookbook sre.hadoop.reboot-workers for Hadoop analytics cluster
* 23:38 btullis@cumin1003: END (FAIL) - Cookbook sre.hadoop.reboot-workers (exit_code=99) for Hadoop analytics cluster
* 22:55 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3081.*
* 22:20 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3073.esams.wmnet [reason: trixie reimaging]
* 22:15 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3073.esams.wmnet with OS trixie
* 22:11 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3072.esams.wmnet [reason: trixie reimaging]
* 22:10 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3072.esams.wmnet with OS trixie
* 22:05 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on releases1003.eqiad.wmnet with reason: [[phab:T420246|T420246]]
* 22:05 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T420246|T420246]]
* 21:48 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3073.esams.wmnet with reason: host reimage
* 21:44 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3072.esams.wmnet with reason: host reimage
* 21:41 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3073.esams.wmnet with reason: host reimage
* 21:39 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3072.esams.wmnet with reason: host reimage
* 21:38 ryankemper: [[phab:T411568|T411568]] Failed back HDFS NameNode from an-master1004 to an-master1003; cluster back to original active/standby configuration
* 21:15 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3073.esams.wmnet with OS trixie
* 21:14 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3073.esams.wmnet [reason: trixie reimaging]
* 21:14 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3072.esams.wmnet with OS trixie
* 21:14 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3072.esams.wmnet [reason: trixie reimaging]
* 21:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3070.esams.wmnet [reason: trixie reimaging]
* 21:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3071.esams.wmnet [reason: trixie reimaging]
* 21:09 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3070.esams.wmnet with OS trixie
* 21:05 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3071.esams.wmnet with OS trixie
* 20:59 alexsanford@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254280{{!}}Remove notice from login form in popup mode (T418534)]] (duration: 07m 32s)
* 20:56 alexsanford@deploy2002: alexsanford: Continuing with sync
* 20:54 alexsanford@deploy2002: alexsanford: Backport for [[gerrit:1254280{{!}}Remove notice from login form in popup mode (T418534)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 alexsanford@deploy2002: Started scap sync-world: Backport for [[gerrit:1254280{{!}}Remove notice from login form in popup mode (T418534)]]
* 20:48 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:43 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:43 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3070.esams.wmnet with reason: host reimage
* 20:40 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 20:40 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 20:38 ryankemper: [[phab:T411568|T411568]] failed over HDFS NameNode from an-master1003 to an-master1004, then rebooted `an-master1003`
* 20:38 ryankemper: [[phab:T411568|T411568]] rebooted `an-coord1003`, `an-coord1004`, `an-tool1007`, `an-tool1008`, `an-tool1011`, `an-web1001`
* 20:38 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3071.esams.wmnet with reason: host reimage
* 20:34 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3070.esams.wmnet with reason: host reimage
* 20:34 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3071.esams.wmnet with reason: host reimage
* 20:31 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254301{{!}}Passwordless login: Don't display conditional auth errors]], [[gerrit:1254302{{!}}Passwordless login: Don't display conditional auth errors]] (duration: 08m 56s)
* 20:27 catrope@deploy2002: catrope: Continuing with sync
* 20:24 catrope@deploy2002: catrope: Backport for [[gerrit:1254301{{!}}Passwordless login: Don't display conditional auth errors]], [[gerrit:1254302{{!}}Passwordless login: Don't display conditional auth errors]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:22 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1254301{{!}}Passwordless login: Don't display conditional auth errors]], [[gerrit:1254302{{!}}Passwordless login: Don't display conditional auth errors]]
* 20:16 ryankemper: [[phab:T411568|T411568]] rebooted `an-test-master1002`, `an-test-master1003`, `an-test-master1004`, `archiva1002`
* 20:12 aude@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251309{{!}}Set wgReadingListsBetaDefaultForNewAccountsAfter for beta cluster (T419163)]] (duration: 08m 53s)
* 20:09 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3071.esams.wmnet with OS trixie
* 20:09 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3070.esams.wmnet with OS trixie
* 20:08 aude@deploy2002: aude: Continuing with sync
* 20:08 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3070.esams.wmnet [reason: trixie reimaging]
* 20:08 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3068.esams.wmnet [reason: trixie reimaging]
* 20:07 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3069.esams.wmnet [reason: trixie reimaging]
* 20:06 aude@deploy2002: aude: Backport for [[gerrit:1251309{{!}}Set wgReadingListsBetaDefaultForNewAccountsAfter for beta cluster (T419163)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:03 aude@deploy2002: Started scap sync-world: Backport for [[gerrit:1251309{{!}}Set wgReadingListsBetaDefaultForNewAccountsAfter for beta cluster (T419163)]]
* 19:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3081.esams.wmnet with OS trixie
* 19:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3069.esams.wmnet with OS trixie
* 19:54 ryankemper: [[phab:T411568|T411568]] rebooted `an-test-client1002`, `an-test-ui1001`, `an-test-coord1001`, `an-test-master1001`
* 19:50 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3068.esams.wmnet with OS trixie
* 19:46 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 19:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast2003.wikimedia.org with OS trixie
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3081.esams.wmnet with reason: host reimage
* 19:28 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3081.esams.wmnet with reason: host reimage
* 19:28 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases1003.eqiad.wmnet with reason: [[phab:T420246|T420246]]
* 19:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3069.esams.wmnet with reason: host reimage
* 19:23 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3068.esams.wmnet with reason: host reimage
* 19:21 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3069.esams.wmnet with reason: host reimage
* 19:20 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3068.esams.wmnet with reason: host reimage
* 19:19 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast2003.wikimedia.org with reason: host reimage
* 19:11 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on bast2003.wikimedia.org with reason: host reimage
* 19:08 dzahn@dns1004: END - running authdns-update
* 19:07 dzahn@dns1004: START - running authdns-update
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3081.esams.wmnet with OS trixie
* 19:05 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp3081.esams.wmnet with OS trixie
* 19:00 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3080.*
* 18:56 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3069.esams.wmnet with OS trixie
* 18:55 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3069.esams.wmnet [reason: trixie reimaging]
* 18:55 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3068.esams.wmnet with OS trixie
* 18:55 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid public cluster: Reboot Druid nodes
* 18:54 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3068.esams.wmnet [reason: trixie reimaging]
* 18:53 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3066.esams.wmnet [reason: trixie reimaging]
* 18:53 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3067.esams.wmnet [reason: trixie reimaging]
* 18:50 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host bast2003.wikimedia.org with OS trixie
* 18:49 swfrench-wmf: manually uncordoned wikikube-worker-exp1001.eqiad.wmnet after failed reboot
* 18:47 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3080.esams.wmnet with OS trixie
* 18:43 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3067.esams.wmnet with OS trixie
* 18:40 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3066.esams.wmnet with OS trixie
* 18:32 dwisehaupt@dns1005: END - running authdns-update
* 18:31 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast2003.wikimedia.org with OS bookworm
* 18:31 dwisehaupt@dns1005: START - running authdns-update
* 18:25 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 18:25 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 18:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3080.esams.wmnet with reason: host reimage
* 18:19 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[5-8].magru.wmnet<nowiki>}</nowiki> and A:cp
* 18:19 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7008.magru.wmnet
* 18:17 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:16 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:16 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3067.esams.wmnet with reason: host reimage
* 18:16 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 18:13 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3066.esams.wmnet with reason: host reimage
* 18:09 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3080.esams.wmnet with reason: host reimage
* 18:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast2003.wikimedia.org with reason: host reimage
* 18:04 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3067.esams.wmnet with reason: host reimage
* 18:03 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3066.esams.wmnet with reason: host reimage
* 18:03 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on bast2003.wikimedia.org with reason: host reimage
* 17:52 cgoubert@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1312-1327].eqiad.wmnet,wikikube-worker-exp1001.eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 17:52 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 17:52 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 17:46 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3080.esams.wmnet with OS trixie
* 17:44 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:43 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:43 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:42 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3081.esams.wmnet with OS trixie
* 17:42 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp3081.esams.wmnet with OS trixie
* 17:42 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:41 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid public cluster: Reboot Druid nodes
* 17:40 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:39 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7013.magru.wmnet,cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 17:39 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7016.magru.wmnet
* 17:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:37 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:37 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7007.magru.wmnet
* 17:37 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:34 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:33 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:31 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:29 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3067.esams.wmnet with OS trixie
* 17:29 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3067.esams.wmnet [reason: trixie reimaging]
* 17:28 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3066.esams.wmnet with OS trixie
* 17:28 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:27 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp3066.esams.wmnet with OS trixie
* 17:27 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3066.esams.wmnet with OS trixie
* 17:26 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:26 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3066.esams.wmnet [reason: trixie reimaging]
* 17:21 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:20 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:19 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3081.esams.wmnet with OS trixie
* 17:19 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:16 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1071.eqiad.wmnet
* 17:16 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:16 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2069.codfw.wmnet
* 17:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 17:14 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:14 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:14 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:13 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:13 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:09 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7014.*
* 17:09 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1071.eqiad.wmnet
* 17:09 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2069.codfw.wmnet
* 17:09 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:08 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1070.eqiad.wmnet
* 17:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2068.codfw.wmnet
* 17:07 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:06 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:06 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host bast2003.wikimedia.org with OS bookworm
* 17:06 cgoubert@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1312-1327].eqiad.wmnet,wikikube-worker-exp1001.eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 17:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['bast2003']
* 17:02 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2068.codfw.wmnet
* 17:02 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1070.eqiad.wmnet
* 17:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2067.codfw.wmnet
* 17:01 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1069.eqiad.wmnet
* 17:01 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 17:00 cgoubert@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7014.magru.wmnet with OS trixie
* 16:58 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 16:58 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 16:58 btullis@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
* 16:57 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7015.magru.wmnet
* 16:56 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7006.magru.wmnet
* 16:55 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1069.eqiad.wmnet
* 16:55 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2067.codfw.wmnet
* 16:53 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1068.eqiad.wmnet
* 16:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2066.codfw.wmnet
* 16:47 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:47 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist all cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 16:46 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 16:46 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['bast2003']
* 16:45 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1068.eqiad.wmnet
* 16:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2066.codfw.wmnet
* 16:44 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1067.eqiad.wmnet
* 16:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2065.codfw.wmnet
* 16:42 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid analytics cluster: Reboot Druid nodes
* 16:40 btullis@cumin1003: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
* 16:37 cgoubert@cumin1003: START - Cookbook sre.dns.netbox
* 16:36 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1067.eqiad.wmnet
* 16:36 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2065.codfw.wmnet
* 16:36 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2064.codfw.wmnet
* 16:35 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1066.eqiad.wmnet
* 16:34 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist group2 cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 16:33 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul2003.codfw.wmnet with OS trixie
* 16:33 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7014.magru.wmnet with reason: host reimage
* 16:32 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wikikube-worker[1306,1308-1311].eqiad.wmnet
* 16:32 cgoubert@cumin1003: START - Cookbook sre.hosts.remove-downtime for wikikube-worker[1306,1308-1311].eqiad.wmnet
* 16:28 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7014.magru.wmnet with reason: host reimage
* 16:28 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1066.eqiad.wmnet
* 16:28 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2064.codfw.wmnet
* 16:25 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on releases2003.codfw.wmnet with reason: [[phab:T420246|T420246]]
* 16:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2063.codfw.wmnet
* 16:25 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1065.eqiad.wmnet
* 16:25 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1306,1308-1311].eqiad.wmnet
* 16:25 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1306,1308-1311].eqiad.wmnet
* 16:18 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1065.eqiad.wmnet
* 16:18 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2063.codfw.wmnet
* 16:18 btullis@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-jumbo-eqiad
* 16:17 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2062.codfw.wmnet
* 16:17 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1064.eqiad.wmnet
* 16:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7013.magru.wmnet
* 16:15 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul2003.codfw.wmnet with reason: host reimage
* 16:14 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7005.magru.wmnet
* 16:10 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2062.codfw.wmnet
* 16:08 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul2003.codfw.wmnet with reason: host reimage
* 16:07 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1064.eqiad.wmnet
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7014.magru.wmnet with OS trixie
* 16:05 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp7014.magru.wmnet with OS trixie
* 16:03 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7013.magru.wmnet,cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 16:03 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[5-8].magru.wmnet<nowiki>}</nowiki> and A:cp
* 15:54 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1009.eqiad.wmnet
* 15:54 mutante: zuul2003 - reimaging with trixie
* 15:52 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist group1 cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 15:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2033.codfw.wmnet
* 15:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2033.codfw.wmnet
* 15:46 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul2003.codfw.wmnet with OS trixie
* 15:45 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1009.eqiad.wmnet
* 15:45 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1008.eqiad.wmnet
* 15:44 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist group0 cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 15:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2033.codfw.wmnet
* 15:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 15:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2048.codfw.wmnet
* 15:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2048.codfw.wmnet
* 15:36 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1008.eqiad.wmnet
* 15:36 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1007.eqiad.wmnet
* 15:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2048.codfw.wmnet
* 15:34 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1012.eqiad.wmnet with reason: host reimage
* 15:33 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist testwikis cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 15:32 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid analytics cluster: Reboot Druid nodes
* 15:28 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1007.eqiad.wmnet
* 15:28 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1006.eqiad.wmnet
* 15:27 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1012.eqiad.wmnet with reason: host reimage
* 15:27 samtar@deploy2002: mwscript-k8s job started: cleanupWatchlistLabelMember.php --wiki=testwiki # [[phab:T420328|T420328]]
* 15:27 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2008-dev.codfw.wmnet
* 15:25 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 15:23 jmm@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 15:22 btullis@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-jumbo-eqiad
* 15:21 jmm@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 15:20 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2008-dev.codfw.wmnet
* 15:20 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1006.eqiad.wmnet
* 15:20 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1005.eqiad.wmnet
* 15:18 jmm@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 15:18 urbanecm@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244723{{!}}cleanup: Growth: Remove temporary GrowthMentorList overrides (T418518)]] (duration: 06m 32s)
* 15:16 jmm@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 15:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 16509
* 15:14 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2005-dev.codfw.wmnet
* 15:14 urbanecm@deploy2002: urbanecm: Continuing with sync
* 15:13 urbanecm@deploy2002: urbanecm: Backport for [[gerrit:1244723{{!}}cleanup: Growth: Remove temporary GrowthMentorList overrides (T418518)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:13 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 15:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2048.codfw.wmnet
* 15:11 urbanecm@deploy2002: Started scap sync-world: Backport for [[gerrit:1244723{{!}}cleanup: Growth: Remove temporary GrowthMentorList overrides (T418518)]]
* 15:10 brennen@deploy2002: Finished deploy [phabricator/deployment@e845707]: deploy phab1004 for [[phab:T420366|T420366]] (duration: 01m 02s)
* 15:09 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1005.eqiad.wmnet
* 15:09 brennen@deploy2002: Started deploy [phabricator/deployment@e845707]: deploy phab1004 for [[phab:T420366|T420366]]
* 15:09 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254217{{!}}Create dblists for wikis where CheckUser and AbuseFilter are disabled (T420063 T420062)]] (duration: 06m 38s)
* 15:09 brennen@deploy2002: Finished deploy [phabricator/deployment@e845707]: deploy phab2002 for [[phab:T420366|T420366]] (duration: 00m 35s)
* 15:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudservices2005-dev.codfw.wmnet
* 15:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2004-dev.codfw.wmnet
* 15:08 brennen@deploy2002: Started deploy [phabricator/deployment@e845707]: deploy phab2002 for [[phab:T420366|T420366]]
* 15:08 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2009.codfw.wmnet
* 15:05 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
* 15:05 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 15:05 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
* 15:04 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1254217{{!}}Create dblists for wikis where CheckUser and AbuseFilter are disabled (T420063 T420062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:04 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7014.magru.wmnet with OS trixie
* 15:03 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host phab1004.eqiad.wmnet
* 15:02 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1254217{{!}}Create dblists for wikis where CheckUser and AbuseFilter are disabled (T420063 T420062)]]
* 15:02 topranks: reset BGP session to ssw1-d8-eqiad from lsw1-d4-eqiad [[phab:T420180|T420180]]
* 15:02 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 15:02 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudservices2004-dev.codfw.wmnet
* 15:02 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2003-dev.codfw.wmnet
* 15:00 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2009.codfw.wmnet
* 15:00 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2008.codfw.wmnet
* 14:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2034.codfw.wmnet
* 14:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2034.codfw.wmnet
* 14:57 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host phab1004.eqiad.wmnet
* 14:55 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2003-dev.codfw.wmnet
* 14:55 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2002-dev.codfw.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host durum4004.ulsfo.wmnet
* 14:53 jmm@deploy2002: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 14:53 jmm@deploy2002: helmfile [staging] START helmfile.d/services/thumbor: apply
* 14:52 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2008.codfw.wmnet
* 14:52 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2007.codfw.wmnet
* 14:51 topranks: stop accepting routes on ssw1-d8-eqiad from external peers (cr2-eqiad, other spines) [[phab:T420351|T420351]]
* 14:51 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
* 14:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host durum4004.ulsfo.wmnet
* 14:50 topranks: stop announcing routes from ssw1-d8-eqiad to external peers (cr2-eqiad, other spines) [[phab:T420351|T420351]]
* 14:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2034.codfw.wmnet
* 14:48 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2002-dev.codfw.wmnet
* 14:48 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2001-dev.codfw.wmnet
* 14:46 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2007.codfw.wmnet
* 14:46 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2006.codfw.wmnet
* 14:45 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
* 14:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
* 14:44 taavi: deploying cr firewall changes from https://gerrit.wikimedia.org/r/c/operations/homer/public/+/1254211
* 14:44 topranks: stop announcing "direct" routes to ssw1-d8-eqiad from cr2-eqiad [[phab:T420351|T420351]]
* 14:44 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2034.codfw.wmnet
* 14:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
* 14:43 moritzm: failover Ganeti master in codfw to ganeti2047
* 14:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2050.codfw.wmnet
* 14:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2050.codfw.wmnet
* 14:41 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2001-dev.codfw.wmnet
* 14:41 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2007-dev.codfw.wmnet
* 14:40 topranks: disabling EVPN IBGP peering from ssw1-d8-eqiad to ssw1-d1-eqiad to stop them reflecting routes [[phab:T420351|T420351]]
* 14:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1006.eqiad.wmnet
* 14:39 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
* 14:38 inflatador: bking@requestctl remove `wdqs_highest_error_rate_ever_seen` requestctl rule as it is no longer needed
* 14:38 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2006.codfw.wmnet
* 14:37 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2005.codfw.wmnet
* 14:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2050.codfw.wmnet
* 14:35 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2007-dev.codfw.wmnet
* 14:35 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2006-dev.codfw.wmnet
* 14:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1006.eqiad.wmnet
* 14:34 Daimona: Creating ce_event_goals DB table for the CampaignEvents extension in x1.testwiki, x1.test2wiki, x1.officewiki, and x1.wikishared # [[phab:T411433|T411433]]
* 14:34 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2050.codfw.wmnet
* 14:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2049.codfw.wmnet
* 14:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2049.codfw.wmnet
* 14:31 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
* 14:30 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2005.codfw.wmnet
* 14:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2049.codfw.wmnet
* 14:27 topranks: de-pref internet circuits landing on cr2-eqiad to shift traffic to cr1 [[phab:T420351|T420351]]
* 14:27 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2006-dev.codfw.wmnet
* 14:27 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2005-dev.codfw.wmnet
* 14:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2049.codfw.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2047.codfw.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2047.codfw.wmnet
* 14:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-test-presto1001.eqiad.wmnet
* 14:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host an-test-presto1001.eqiad.wmnet
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2047.codfw.wmnet
* 14:19 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2005-dev.codfw.wmnet
* 14:19 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudlb2004-dev.codfw.wmnet
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 14:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2047.codfw.wmnet
* 14:14 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2046.codfw.wmnet
* 14:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2046.codfw.wmnet
* 14:13 topranks: disable VRRP on cr2-eqiad interfaces facing ssw1-d8-eqiad [[phab:T420351|T420351]]
* 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 14:11 moritzm: powercycling ganeti2046 (stuck on reboot)
* 14:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2004.codfw.wmnet
* 14:10 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudlb2004-dev.codfw.wmnet
* 14:10 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudlb2003-dev.codfw.wmnet
* 14:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people2004.codfw.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet
* 14:05 topranks: setting cr1-eqiad as VRRP master for all vlans [[phab:T420351|T420351]]
* 14:01 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudlb2003-dev.codfw.wmnet
* 14:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet
* 13:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2046.codfw.wmnet
* 13:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6002.drmrs.wmnet
* 13:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6002.drmrs.wmnet
* 13:57 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.20 refs [[phab:T413811|T413811]]
* 13:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2046.codfw.wmnet
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2045.codfw.wmnet
* 13:52 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudlb2002-dev.codfw.wmnet
* 13:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6002.drmrs.wmnet
* 13:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2045.codfw.wmnet
* 13:45 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254189{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]], [[gerrit:1254190{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]] (duration: 08m 10s)
* 13:44 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2003.codfw.wmnet
* 13:42 esanders@deploy2002: esanders: Continuing with sync
* 13:39 esanders@deploy2002: esanders: Backport for [[gerrit:1254189{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]], [[gerrit:1254190{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:38 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be2003.codfw.wmnet
* 13:38 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2004.codfw.wmnet
* 13:37 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1254189{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]], [[gerrit:1254190{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]]
* 13:35 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on logstash2023.codfw.wmnet with reason: ganeti reboot
* 13:33 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 13:32 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host apus-be2004.codfw.wmnet
* 13:32 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2045.codfw.wmnet
* 13:32 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2002.codfw.wmnet
* 13:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6002.drmrs.wmnet
* 13:30 cscott@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251610{{!}}Turn on postprocessing cache for all Parsoid parses (T348255)]] (duration: 10m 31s)
* 13:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 13:26 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be2002.codfw.wmnet
* 13:26 cscott@deploy2002: cscott: Continuing with sync
* 13:26 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2001.codfw.wmnet
* 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6001.drmrs.wmnet
* 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6001.drmrs.wmnet
* 13:22 cscott@deploy2002: cscott: Backport for [[gerrit:1251610{{!}}Turn on postprocessing cache for all Parsoid parses (T348255)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2044.codfw.wmnet
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2044.codfw.wmnet
* 13:20 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be2001.codfw.wmnet
* 13:20 cgoubert@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker13[00-47].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 13:20 cscott@deploy2002: Started scap sync-world: Backport for [[gerrit:1251610{{!}}Turn on postprocessing cache for all Parsoid parses (T348255)]]
* 13:20 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:19 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:19 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2280-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 13:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6001.drmrs.wmnet
* 13:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6001.drmrs.wmnet
* 13:16 btullis@cumin1003: END (PASS) - Cookbook sre.presto.reboot-workers (exit_code=0) for Presto an-presto cluster: Reboot Presto nodes
* 13:15 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1003.eqiad.wmnet
* 13:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2044.codfw.wmnet
* 13:15 aklapper@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254166{{!}}Remove misplaced readonly from CategoryViewer::$query (T420315)]] (duration: 06m 31s)
* 13:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint1001.eqiad.wmnet
* 13:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint1001.eqiad.wmnet
* 13:11 aklapper@deploy2002: zabe, aklapper: Continuing with sync
* 13:11 aklapper@deploy2002: zabe, aklapper: Backport for [[gerrit:1254166{{!}}Remove misplaced readonly from CategoryViewer::$query (T420315)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:10 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:10 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be1003.eqiad.wmnet
* 13:10 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 16509
* 13:09 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1004.eqiad.wmnet
* 13:09 aklapper@deploy2002: Started scap sync-world: Backport for [[gerrit:1254166{{!}}Remove misplaced readonly from CategoryViewer::$query (T420315)]]
* 13:08 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint2001.codfw.wmnet
* 13:04 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host apus-be1004.eqiad.wmnet
* 13:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint2001.codfw.wmnet
* 13:02 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2044.codfw.wmnet
* 13:02 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1002.eqiad.wmnet
* 13:01 moritzm: failover Ganeti masters in drmrs to ganeti6003/6004
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6004.drmrs.wmnet
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6004.drmrs.wmnet
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2043.codfw.wmnet
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2043.codfw.wmnet
* 12:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 214657
* 12:56 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 214657
* 12:56 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 56308
* 12:55 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-eqiad
* 12:55 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 56308
* 12:55 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 28788
* 12:55 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be1002.eqiad.wmnet
* 12:55 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1001.eqiad.wmnet
* 12:54 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 28788
* 12:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6004.drmrs.wmnet
* 12:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2043.codfw.wmnet
* 12:53 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'email' for AS: 28788
* 12:53 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 28788
* 12:53 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 9269
* 12:52 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1012
* 12:52 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:51 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 9269
* 12:51 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1012
* 12:51 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-e8-eqiad
* 12:51 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-e8-eqiad
* 12:50 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:48 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1015
* 12:48 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be1001.eqiad.wmnet
* 12:45 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1015
* 12:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6004.drmrs.wmnet
* 12:44 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:44 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2043.codfw.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6003.drmrs.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6003.drmrs.wmnet
* 12:40 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 12:39 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2042.codfw.wmnet
* 12:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2042.codfw.wmnet
* 12:38 moritzm: powercycling ganeti2042 (stuck on reboot)
* 12:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6003.drmrs.wmnet
* 12:34 moritzm: powercycling ganeti2041 (stuck on reboot)
* 12:31 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:24 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6003.drmrs.wmnet
* 12:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install1005.wikimedia.org
* 12:22 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2042.codfw.wmnet
* 12:20 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-cluster
* 12:20 Emperor: roll-reboot apus frontends (codfw) for March reboots
* 12:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install1005.wikimedia.org
* 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install2005.wikimedia.org
* 12:13 topranks: restart BGP announcements from ssw1-d1-eqiad following change [[phab:T420180|T420180]]
* 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2042.codfw.wmnet
* 12:08 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2280-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 12:07 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
* 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install2005.wikimedia.org
* 12:06 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
* 12:06 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
* 12:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install3004.wikimedia.org
* 12:06 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
* 12:05 jayme@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=(registry1005.eqiad.wmnet{{!}}registry2005.codfw.wmnet)
* 12:05 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2005.codfw.wmnet
* 12:05 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1005.eqiad.wmnet
* 12:04 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
* 12:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install3004.wikimedia.org
* 12:04 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
* 12:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install4003.wikimedia.org
* 12:03 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c7-eqiad [[phab:T420180|T420180]]
* 12:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2041.codfw.wmnet
* 12:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2041.codfw.wmnet
* 12:01 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1005.eqiad.wmnet
* 12:01 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2005.codfw.wmnet
* 12:00 jayme@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=(registry1005.eqiad.wmnet{{!}}registry2005.codfw.wmnet)
* 12:00 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c6-eqiad [[phab:T420180|T420180]]
* 12:00 jayme@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=(registry1004.eqiad.wmnet{{!}}registry2004.codfw.wmnet)
* 11:59 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1004.eqiad.wmnet
* 11:59 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2004.codfw.wmnet
* 11:59 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c4-eqiad [[phab:T420180|T420180]]
* 11:58 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c3-eqiad [[phab:T420180|T420180]]
* 11:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install4003.wikimedia.org
* 11:56 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c2-eqiad [[phab:T420180|T420180]]
* 11:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install5003.wikimedia.org
* 11:55 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1004.eqiad.wmnet
* 11:55 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2004.codfw.wmnet
* 11:54 jayme@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=(registry1004.eqiad.wmnet{{!}}registry2004.codfw.wmnet)
* 11:54 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-d3-eqiad [[phab:T420180|T420180]]
* 11:53 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-d1-eqiad [[phab:T420180|T420180]]
* 11:52 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 11:52 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5007.eqsin.wmnet
* 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5007.eqsin.wmnet
* 11:49 btullis@cumin1003: START - Cookbook sre.hadoop.reboot-workers for Hadoop analytics cluster
* 11:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install5003.wikimedia.org
* 11:48 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-eqiad
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install6003.wikimedia.org
* 11:47 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 11:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install6003.wikimedia.org
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2041.codfw.wmnet
* 11:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5007.eqsin.wmnet
* 11:43 btullis@cumin1003: START - Cookbook sre.presto.reboot-workers for Presto an-presto cluster: Reboot Presto nodes
* 11:41 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5007.eqsin.wmnet
* 11:41 cgoubert@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker13[00-47].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 11:39 topranks: stop accepting external routes on ssw1-d1-eqiad from cr1-eqiad [[phab:T420180|T420180]]
* 11:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install7002.wikimedia.org
* 11:33 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-cluster
* 11:33 Emperor: roll-reboot apus frontends (eqiad) for March reboots
* 11:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install7002.wikimedia.org
* 11:28 moritzm: failover Ganeti master in eqsin to ganeti5004
* 11:27 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2041.codfw.wmnet
* 11:24 topranks: reduce local-preference for BGP routes learnt from servers on cr1-eqiad [[phab:T420180|T420180]]
* 11:22 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 11:18 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2040.codfw.wmnet
* 11:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2040.codfw.wmnet
* 11:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5006.eqsin.wmnet
* 11:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2040.codfw.wmnet
* 11:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5006.eqsin.wmnet
* 11:05 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:05 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:04 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:03 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2040.codfw.wmnet
* 11:01 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5005.eqsin.wmnet
* 11:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5005.eqsin.wmnet
* 11:00 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:59 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:58 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:58 topranks: prepend external BGP announcements from cr1-eqiad [[phab:T420180|T420180]]
* 10:57 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:56 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:56 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2039.codfw.wmnet
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2039.codfw.wmnet
* 10:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5005.eqsin.wmnet
* 10:52 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-eqiad
* 10:51 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:49 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:49 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:49 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5005.eqsin.wmnet
* 10:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2039.codfw.wmnet
* 10:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 10:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 10:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2039.codfw.wmnet
* 10:45 javiermonton@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 10:45 javiermonton@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2038.codfw.wmnet
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2038.codfw.wmnet
* 10:43 javiermonton@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 10:43 javiermonton@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
* 10:42 topranks: cease announcing routed networks from ssw1-d1-eqiad to cr1-eqiad in BGP [[phab:T420180|T420180]]
* 10:41 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:41 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:40 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:39 javiermonton@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 10:39 javiermonton@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
* 10:38 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2038.codfw.wmnet
* 10:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 10:37 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:33 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudgw2004-dev.codfw.wmnet
* 10:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2038.codfw.wmnet
* 10:29 topranks: stop announcing directly connected routes to L3 switches from cr1-eqiad [[phab:T420180|T420180]]
* 10:28 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:27 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudgw2004-dev.codfw.wmnet
* 10:27 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudgw2003-dev.codfw.wmnet
* 10:26 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:25 topranks: disable EVPN IBGP peering between ssw1-d1-eqiad and ssw1-d8-eqiad [[phab:T420180|T420180]]
* 10:24 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:21 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 10:21 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 10:20 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudgw2003-dev.codfw.wmnet
* 10:20 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:19 urbanecm: Delete `job/growthexperiments-listtaskcounts-29513771` from mw-cron (job stuck for more than a month)
* 10:19 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid1003.eqiad.wmnet
* 10:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 10:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2037.codfw.wmnet
* 10:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2037.codfw.wmnet
* 10:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid1003.eqiad.wmnet
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2002.codfw.wmnet
* 10:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2037.codfw.wmnet
* 10:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2002.codfw.wmnet
* 10:06 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2002.codfw.wmnet
* 10:05 topranks: disabling VRRP for et-1/0/5 sub-interfaces on cr1-eqiad [[phab:T420180|T420180]]
* 10:03 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:03 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:02 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:01 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:01 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:01 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:00 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf2002.codfw.wmnet
* 10:00 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2001.codfw.wmnet
* 09:59 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:59 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:59 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:59 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:58 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid2003.codfw.wmnet
* 09:57 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:57 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 09:57 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 09:56 topranks: shift traffic from codfw to eqiad off Arelion CCT to Lumen
* 09:56 mvernon@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling reboot on A:thanos-fe
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid2003.codfw.wmnet
* 09:54 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf2001.codfw.wmnet
* 09:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 09:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2037.codfw.wmnet
* 09:53 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:52 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:50 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:47 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 09:47 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
* 09:42 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:42 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
* 09:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2002.codfw.wmnet
* 09:38 moritzm: installing openssl bugfix updates on trixie hosts
* 09:31 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2002.codfw.wmnet
* 09:31 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2001.codfw.wmnet
* 09:25 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2001.codfw.wmnet
* 09:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1002.eqiad.wmnet
* 09:21 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-eqiad
* 09:20 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1002.eqiad.wmnet
* 09:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
* 09:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
* 09:10 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254114{{!}}Revert "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]] (duration: 12m 36s)
* 09:06 topranks: increase VRRP priority on eqiad vlans on CR2 to shift active gateway to cr2-eqiad [[phab:T420180|T420180]]
* 09:05 mvernon@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling reboot on A:thanos-fe
* 09:03 kharlan@deploy2002: kharlan: Continuing with sync
* 09:02 kharlan@deploy2002: kharlan: Backport for [[gerrit:1254114{{!}}Revert "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:58 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-canary
* 08:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2036.codfw.wmnet
* 08:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2036.codfw.wmnet
* 08:57 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1254114{{!}}Revert "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]]
* 08:57 moritzm: rebuilt the trixie d-i image for the 13.4 point release [[phab:T420240|T420240]]
* 08:54 kharlan@deploy2002: Sync cancelled.
* 08:52 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-canary
* 08:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2036.codfw.wmnet
* 08:49 kharlan@deploy2002: harroyo-wmf, kharlan: Backport for [[gerrit:1250575{{!}}hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend (T419125)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2036.codfw.wmnet
* 08:44 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host bast2003.wikimedia.org
* 08:43 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1250575{{!}}hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend (T419125)]]
* 08:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:42 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:35 arnaudb@cumin1003: END (PASS) - Cookbook sre.gerrit.restart-gerrit (exit_code=0) Restarting Gerrit on gerrit2002
* 08:34 arnaudb@cumin1003: START - Cookbook sre.gerrit.restart-gerrit Restarting Gerrit on gerrit2002
* 08:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
* 08:34 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host contint1002.wikimedia.org
* 08:32 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:28 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
* 08:27 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host contint1002.wikimedia.org
* 08:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2035.codfw.wmnet
* 08:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2035.codfw.wmnet
* 08:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host parsoidtest1001.eqiad.wmnet
* 08:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2035.codfw.wmnet
* 08:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti3005.esams.wmnet to cluster esams03 and group B
* 08:14 moritzm: powercycling bast2003 (stuck on reboot)
* 08:14 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti3005.esams.wmnet to cluster esams03 and group B
* 08:14 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host parsoidtest1001.eqiad.wmnet
* 08:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3005.esams.wmnet
* 08:09 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:08 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3005.esams.wmnet
* 07:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
* 07:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5004.wikimedia.org
* 07:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti3005.esams.wmnet with OS bookworm
* 07:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5004.wikimedia.org
* 07:37 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1285-1289,1291-1299].eqiad.wmnet
* 07:37 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1285-1289,1291-1299].eqiad.wmnet
* 07:34 arnaudb@cumin1003: END (PASS) - Cookbook sre.gerrit.restart-gerrit (exit_code=0) Restarting Gerrit on gerrit2003
* 07:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2035.codfw.wmnet
* 07:32 arnaudb@cumin1003: START - Cookbook sre.gerrit.restart-gerrit Restarting Gerrit on gerrit2003
* 07:32 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2033.codfw.wmnet
* 07:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 07:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2032.codfw.wmnet
* 07:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2032.codfw.wmnet
* 07:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti3005.esams.wmnet with reason: host reimage
* 07:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti3005.esams.wmnet with reason: host reimage
* 07:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2032.codfw.wmnet
* 07:23 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2032.codfw.wmnet
* 07:00 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti3005.esams.wmnet with OS bookworm
* 06:08 kart_: Updated cxserver to 2026-03-16-071247-production ([[phab:T420004|T420004]])
* 06:07 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 06:06 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 06:05 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 06:04 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 05:58 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 05:58 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 04:41 dwisehaupt@dns1005: END - running authdns-update
* 04:39 dwisehaupt@dns1005: START - running authdns-update
* 04:01 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.17 (duration: 01m 17s)
* 03:43 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.20 refs [[phab:T413811|T413811]] (duration: 39m 34s)
* 03:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.20 refs [[phab:T413811|T413811]]
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 10s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 00:26 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6009.*
* 00:25 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6009.drmrs.wmnet with OS trixie
* 00:07 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251158{{!}}Enable languages in main menu on Russian Wikipedia (T419730)]] (duration: 06m 57s)
* 00:03 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 00:02 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1251158{{!}}Enable languages in main menu on Russian Wikipedia (T419730)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1251158{{!}}Enable languages in main menu on Russian Wikipedia (T419730)]]
== 2026-03-16 ==
* 23:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6009.drmrs.wmnet with reason: host reimage
* 23:56 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253604{{!}}Don't output language HTML when no languages present (T419730)]], [[gerrit:1251157{{!}}Support duplication of languages in header and main menu (T419730)]] (duration: 06m 44s)
* 23:56 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6009.drmrs.wmnet with reason: host reimage
* 23:52 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 23:51 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1253604{{!}}Don't output language HTML when no languages present (T419730)]], [[gerrit:1251157{{!}}Support duplication of languages in header and main menu (T419730)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:50 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1253604{{!}}Don't output language HTML when no languages present (T419730)]], [[gerrit:1251157{{!}}Support duplication of languages in header and main menu (T419730)]]
* 23:36 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6009.drmrs.wmnet with OS trixie
* 23:32 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp601(0{{!}}1).*
* 22:54 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6008.drmrs.wmnet [reason: trixie reimaging]
* 22:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6008.drmrs.wmnet with OS trixie
* 22:37 jforrester@deploy2002: Finished scap sync-world: [[phab:T411807|T411807]] (duration: 11m 10s)
* 22:35 jforrester@deploy2002: jforrester: Continuing with sync
* 22:32 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6010.drmrs.wmnet with OS trixie
* 22:31 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp70[09-12].magru.wmnet<nowiki>}</nowiki> and A:cp
* 22:31 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7012.magru.wmnet
* 22:31 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[1285-1289,1291-1299].eqiad.wmnet
* 22:30 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[1-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 22:30 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7004.magru.wmnet
* 22:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6011.drmrs.wmnet with OS trixie
* 22:28 jforrester@deploy2002: jforrester: [[phab:T411807|T411807]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 jforrester@deploy2002: Started scap sync-world: [[phab:T411807|T411807]]
* 22:27 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1285-1289,1291-1299].eqiad.wmnet
* 22:24 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6008.drmrs.wmnet with reason: host reimage
* 22:20 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6008.drmrs.wmnet with reason: host reimage
* 22:17 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1020-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 22:07 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6010.drmrs.wmnet with reason: host reimage
* 22:05 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6007.drmrs.wmnet [reason: trixie reimaging]
* 22:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6011.drmrs.wmnet with reason: host reimage
* 22:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6007.drmrs.wmnet with OS trixie
* 22:02 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6008.drmrs.wmnet with OS trixie
* 21:59 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6010.drmrs.wmnet with reason: host reimage
* 21:58 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6011.drmrs.wmnet with reason: host reimage
* 21:58 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp6008.drmrs.wmnet with OS trixie
* 21:52 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7011.magru.wmnet
* 21:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7003.magru.wmnet
* 21:42 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul1003.eqiad.wmnet with OS trixie
* 21:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6008.drmrs.wmnet with reason: host reimage
* 21:40 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6010.drmrs.wmnet with OS trixie
* 21:40 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6012.*
* 21:39 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6012.drmrs.wmnet with OS trixie
* 21:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6011.drmrs.wmnet with OS trixie
* 21:38 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6007.drmrs.wmnet with reason: host reimage
* 21:36 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6013.*
* 21:36 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6008.drmrs.wmnet with reason: host reimage
* 21:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6013.drmrs.wmnet with OS trixie
* 21:32 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6007.drmrs.wmnet with reason: host reimage
* 21:28 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul1003.eqiad.wmnet with reason: host reimage
* 21:22 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul1003.eqiad.wmnet with reason: host reimage
* 21:19 Dreamy_Jazz: Evening UTC backport window done
* 21:18 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251848{{!}}Disable CheckUser on closed wikis where no checks were ever made (T420062)]], [[gerrit:1251865{{!}}Uninstall SecurePoll from closed wikis (T420062)]], [[gerrit:1251888{{!}}DiscussionTools: Uninstall wikis closed before permalinks were deployed (T420052)]] (duration: 06m 10s)
* 21:17 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6008.drmrs.wmnet with OS trixie
* 21:16 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6008.drmrs.wmnet [reason: trixie reimaging]
* 21:15 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6006.drmrs.wmnet [reason: trixie reimaging]
* 21:15 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6006.drmrs.wmnet with OS trixie
* 21:14 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 21:14 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1251848{{!}}Disable CheckUser on closed wikis where no checks were ever made (T420062)]], [[gerrit:1251865{{!}}Uninstall SecurePoll from closed wikis (T420062)]], [[gerrit:1251888{{!}}DiscussionTools: Uninstall wikis closed before permalinks were deployed (T420052)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:13 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6012.drmrs.wmnet with reason: host reimage
* 21:12 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6007.drmrs.wmnet with OS trixie
* 21:12 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1251848{{!}}Disable CheckUser on closed wikis where no checks were ever made (T420062)]], [[gerrit:1251865{{!}}Uninstall SecurePoll from closed wikis (T420062)]], [[gerrit:1251888{{!}}DiscussionTools: Uninstall wikis closed before permalinks were deployed (T420052)]]
* 21:12 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6007.drmrs.wmnet [reason: trixie reimaging]
* 21:11 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6005.drmrs.wmnet [reason: trixie reimaging]
* 21:10 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6005.drmrs.wmnet with OS trixie
* 21:10 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7010.magru.wmnet
* 21:10 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7002.magru.wmnet
* 21:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6013.drmrs.wmnet with reason: host reimage
* 21:08 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul1003.eqiad.wmnet with OS trixie
* 21:07 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6012.drmrs.wmnet with reason: host reimage
* 21:06 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6013.drmrs.wmnet with reason: host reimage
* 21:05 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253623{{!}}Fix client credentials access tokens (T417278 T419921)]], [[gerrit:1253625{{!}}Enable $wgTrackMediaRequestProvenance on testwikis and beta cluster (T414338)]], [[gerrit:1253626{{!}}Configure $wgApiClientErrorSampleRate (T418957)]] (duration: 08m 06s)
* 21:01 catrope@deploy2002: matmarex, catrope: Continuing with sync
* 20:59 catrope@deploy2002: matmarex, catrope: Backport for [[gerrit:1253623{{!}}Fix client credentials access tokens (T417278 T419921)]], [[gerrit:1253625{{!}}Enable $wgTrackMediaRequestProvenance on testwikis and beta cluster (T414338)]], [[gerrit:1253626{{!}}Configure $wgApiClientErrorSampleRate (T418957)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:57 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1253623{{!}}Fix client credentials access tokens (T417278 T419921)]], [[gerrit:1253625{{!}}Enable $wgTrackMediaRequestProvenance on testwikis and beta cluster (T414338)]], [[gerrit:1253626{{!}}Configure $wgApiClientErrorSampleRate (T418957)]]
* 20:54 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:54 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:50 brett@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cp[2027-2040].codfw.wmnet
* 20:50 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:50 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2027-2040].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:50 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2027-2040].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6006.drmrs.wmnet with reason: host reimage
* 20:48 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6012.drmrs.wmnet with OS trixie
* 20:46 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6013.drmrs.wmnet with OS trixie
* 20:45 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on cp2042.codfw.wmnet with reason: Testing hosts - not for production
* 20:45 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6005.drmrs.wmnet with reason: host reimage
* 20:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2042.codfw.wmnet with OS trixie
* 20:44 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephmon2007-dev.codfw.wmnet with OS bookworm
* 20:44 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 20:44 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253566{{!}}Configure external link aggregate usage on 12 wikis for top domains (T419837)]] (duration: 06m 59s)
* 20:43 brett@cumin2002: START - Cookbook sre.dns.netbox
* 20:41 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6006.drmrs.wmnet with reason: host reimage
* 20:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on cp2041.codfw.wmnet with reason: Testing hosts - not for production
* 20:41 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6005.drmrs.wmnet with reason: host reimage
* 20:40 kharlan@deploy2002: kharlan, mszwarc: Continuing with sync
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2041.codfw.wmnet with OS trixie
* 20:39 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 20:38 kharlan@deploy2002: kharlan, mszwarc: Backport for [[gerrit:1253566{{!}}Configure external link aggregate usage on 12 wikis for top domains (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:37 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1253566{{!}}Configure external link aggregate usage on 12 wikis for top domains (T419837)]]
* 20:34 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6014.*
* 20:33 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:33 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:32 cscott@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253551{{!}}Fix double post-processing in legacy preview case (T419908)]] (duration: 06m 52s)
* 20:29 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7009.magru.wmnet
* 20:28 cscott@deploy2002: cscott: Continuing with sync
* 20:28 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7001.magru.wmnet
* 20:27 cscott@deploy2002: cscott: Backport for [[gerrit:1253551{{!}}Fix double post-processing in legacy preview case (T419908)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:26 cscott@deploy2002: Started scap sync-world: Backport for [[gerrit:1253551{{!}}Fix double post-processing in legacy preview case (T419908)]]
* 20:22 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2042.codfw.wmnet with reason: host reimage
* 20:22 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6006.drmrs.wmnet with OS trixie
* 20:21 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6006.drmrs.wmnet [reason: trixie reimaging]
* 20:21 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6004.drmrs.wmnet [reason: trixie reimaging]
* 20:21 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6005.drmrs.wmnet with OS trixie
* 20:20 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6004.drmrs.wmnet with OS trixie
* 20:20 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6005.drmrs.wmnet [reason: trixie reimaging]
* 20:19 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6003.drmrs.wmnet [reason: trixie reimaging]
* 20:19 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephmon2007-dev.codfw.wmnet with reason: host reimage
* 20:19 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp70[09-12].magru.wmnet<nowiki>}</nowiki> and A:cp
* 20:18 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[1-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 20:17 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248665{{!}}Enable passwordless login in production (T419198)]], [[gerrit:1253572{{!}}Instrument clicks on external links to selected domains (T419837)]] (duration: 06m 43s)
* 20:16 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:15 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
* 20:15 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephmon2007-dev.codfw.wmnet with reason: host reimage
* 20:15 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6003.drmrs.wmnet with OS trixie
* 20:13 catrope@deploy2002: kharlan, catrope: Continuing with sync
* 20:12 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:12 catrope@deploy2002: kharlan, catrope: Backport for [[gerrit:1248665{{!}}Enable passwordless login in production (T419198)]], [[gerrit:1253572{{!}}Instrument clicks on external links to selected domains (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:12 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2042.codfw.wmnet with reason: host reimage
* 20:11 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
* 20:10 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1248665{{!}}Enable passwordless login in production (T419198)]], [[gerrit:1253572{{!}}Instrument clicks on external links to selected domains (T419837)]]
* 20:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6014.drmrs.wmnet with OS trixie
* 20:03 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp[2027-2040].codfw.wmnet
* 20:01 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251589{{!}}Uninstall GlobalBlocking from closed wikis (T420062)]] (duration: 08m 20s)
* 19:57 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 19:55 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6004.drmrs.wmnet with reason: host reimage
* 19:55 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephmon2007-dev.codfw.wmnet with OS bookworm
* 19:54 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1251589{{!}}Uninstall GlobalBlocking from closed wikis (T420062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:54 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp2042.codfw.wmnet with OS trixie
* 19:53 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephmon2007-dev.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:53 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS trixie
* 19:52 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1251589{{!}}Uninstall GlobalBlocking from closed wikis (T420062)]]
* 19:52 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host cloudcephmon2007-dev.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:51 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251582{{!}}Uninstall AbuseFilter from closed wikis with no AbuseFilter logs (T420063)]] (duration: 09m 26s)
* 19:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6003.drmrs.wmnet with reason: host reimage
* 19:47 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 19:47 mutante: releases2003 - rm rsync-srv-org-wikimedia-releases-releases2003.* - alerts flapping since server reboot - puppet code needs to be improved to ensure units are removed when primary server is switched ([[phab:T420246|T420246]])
* 19:47 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6004.drmrs.wmnet with reason: host reimage
* 19:46 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6003.drmrs.wmnet with reason: host reimage
* 19:44 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1251582{{!}}Uninstall AbuseFilter from closed wikis with no AbuseFilter logs (T420063)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:43 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6014.drmrs.wmnet with reason: host reimage
* 19:42 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1251582{{!}}Uninstall AbuseFilter from closed wikis with no AbuseFilter logs (T420063)]]
* 19:41 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudcephmon2007-dev
* 19:41 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudcephmon2007-dev
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating cloudcephmon2007-dev in codfw - jhancock@cumin2002"
* 19:40 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6014.drmrs.wmnet with reason: host reimage
* 19:39 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253622{{!}}Revert "Media: Use previous step for non-standard width between steps and original" (T419927)]] (duration: 07m 10s)
* 19:35 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 19:34 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1253622{{!}}Revert "Media: Use previous step for non-standard width between steps and original" (T419927)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:32 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating cloudcephmon2007-dev in codfw - jhancock@cumin2002"
* 19:32 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1253622{{!}}Revert "Media: Use previous step for non-standard width between steps and original" (T419927)]]
* 19:28 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp404[5-6].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 19:28 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4046.ulsfo.wmnet
* 19:27 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6004.drmrs.wmnet with OS trixie
* 19:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 19:27 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6004.drmrs.wmnet [reason: trixie reimaging]
* 19:27 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6003.drmrs.wmnet with OS trixie
* 19:26 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6003.drmrs.wmnet [reason: trixie reimaging]
* 19:25 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6002.drmrs.wmnet [reason: trixie reimaging]
* 19:25 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6001.drmrs.wmnet [reason: trixie reimaging]
* 19:21 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6014.drmrs.wmnet with OS trixie
* 19:17 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6002.drmrs.wmnet with OS trixie
* 19:17 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on cp2042.codfw.wmnet with reason: Testing hosts - not for production
* 19:16 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on cp2041.codfw.wmnet with reason: Testing hosts - not for production
* 19:15 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:15 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:12 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6001.drmrs.wmnet with OS trixie
* 19:02 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:02 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 18:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp4046.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 18:57 cdobbins@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4046.ulsfo.wmnet
* 18:52 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6002.drmrs.wmnet with reason: host reimage
* 18:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4045.ulsfo.wmnet
* 18:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6001.drmrs.wmnet with reason: host reimage
* 18:47 cdobbins@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp4046.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 18:47 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6002.drmrs.wmnet with reason: host reimage
* 18:45 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6001.drmrs.wmnet with reason: host reimage
* 18:39 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp404[5-6].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 18:38 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6015.*
* 18:38 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp403[7-9].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 18:38 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4039.ulsfo.wmnet
* 18:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6015.drmrs.wmnet with OS trixie
* 18:27 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6002.drmrs.wmnet with OS trixie
* 18:26 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6002.drmrs.wmnet [reason: trixie reimaging]
* 18:26 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6001.drmrs.wmnet with OS trixie
* 18:24 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6001.drmrs.wmnet [reason: trixie reimaging]
* 18:03 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6015.drmrs.wmnet with reason: host reimage
* 17:59 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6015.drmrs.wmnet with reason: host reimage
* 17:58 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4038.ulsfo.wmnet
* 17:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6015.drmrs.wmnet with OS trixie
* 17:37 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6016.*
* 17:32 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2042.codfw.wmnet with OS trixie
* 17:18 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4037.ulsfo.wmnet
* 17:08 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2042.codfw.wmnet with reason: host reimage
* 17:06 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp403[7-9].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 17:03 fabfur@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2042.codfw.wmnet with reason: host reimage
* 17:02 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6016.drmrs.wmnet with OS trixie
* 16:57 mutante: contint2002 - rebooting
* 16:47 mutante: phab2002 - rebooting
* 16:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:44 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253520{{!}}Add APCOND_OATH_HAS2FA to UserRequirementsPrivateConditions]] (duration: 06m 15s)
* 16:42 mutante: rebooting backends of releases.wikimedia.org
* 16:42 fabfur@cumin1003: START - Cookbook sre.hosts.reimage for host cp2042.codfw.wmnet with OS trixie
* 16:41 fabfur: reimage cp2042 for HAProxy testing ([[phab:T419825|T419825]])
* 16:41 mszwarc@deploy2002: mszwarc: Continuing with sync
* 16:40 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1253520{{!}}Add APCOND_OATH_HAS2FA to UserRequirementsPrivateConditions]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:39 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2041.codfw.wmnet with OS trixie
* 16:38 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1253520{{!}}Add APCOND_OATH_HAS2FA to UserRequirementsPrivateConditions]]
* 16:37 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1020-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 16:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6016.drmrs.wmnet with reason: host reimage
* 16:32 milimetric: my bad, accidentally merged https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1250249, will read docs on config deployment better
* 16:31 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:29 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1012
* 16:27 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1012
* 16:27 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6016.drmrs.wmnet with reason: host reimage
* 16:20 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]] (duration: 07m 28s)
* 16:17 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode2001.codfw.wmnet
* 16:16 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 16:14 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:13 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2005.codfw.wmnet
* 16:12 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]]
* 16:12 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode2001.codfw.wmnet
* 16:11 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode1001.eqiad.wmnet
* 16:11 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts,name=codfw
* 16:11 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
* 16:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum2001.codfw.wmnet
* 16:09 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1024.eqiad.wmnet
* 16:09 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1024.eqiad.wmnet
* 16:09 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1024.eqiad.wmnet
* 16:07 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1004-1007,1011-1012,1015-1016,1019-1021,1029-1031,1034-1168,1240-1289,1291-1327].eqiad.wmnet
* 16:06 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1004-1007,1011-1012,1015-1016,1019-1021,1029-1031,1034-1168,1240-1289,1291-1327].eqiad.wmnet
* 16:06 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6016.drmrs.wmnet with OS trixie
* 16:06 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2005.codfw.wmnet
* 16:06 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode1001.eqiad.wmnet
* 16:05 dwisehaupt@dns1006: END - running authdns-update
* 16:05 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum2001.codfw.wmnet
* 16:05 fabfur@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
* 16:04 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts,name=codfw
* 16:04 dwisehaupt@dns1006: START - running authdns-update
* 16:04 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts,name=eqiad
* 16:00 blake@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1004-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 15:59 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum1001.eqiad.wmnet
* 15:59 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2031.codfw.wmnet
* 15:59 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti2031.codfw.wmnet
* 15:54 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2004.codfw.wmnet
* 15:53 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum1001.eqiad.wmnet
* 15:52 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts,name=eqiad
* 15:52 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncmonitor1001.eqiad.wmnet
* 15:47 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2004.codfw.wmnet
* 15:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2031.codfw.wmnet
* 15:47 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet
* 15:46 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1024.eqiad.wmnet with reason: Rebooting clouddb1024 [[phab:T419960|T419960]]
* 15:44 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1024.eqiad.wmnet
* 15:43 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2031.codfw.wmnet
* 15:43 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1023.eqiad.wmnet
* 15:43 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1023.eqiad.wmnet
* 15:43 fabfur@cumin1003: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS trixie
* 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2030.codfw.wmnet
* 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2030.codfw.wmnet
* 15:42 fabfur: reimage cp2041 for HAProxy testing ([[phab:T419825|T419825]])
* 15:42 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:41 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:41 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2003.codfw.wmnet
* 15:39 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:38 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:38 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-codfw
* 15:38 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:38 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:37 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Rebooting clouddb1023 [[phab:T419960|T419960]]
* 15:35 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1022.eqiad.wmnet
* 15:35 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1022.eqiad.wmnet
* 15:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2030.codfw.wmnet
* 15:32 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2003.codfw.wmnet
* 15:32 dwisehaupt@dns1006: END - running authdns-update
* 15:32 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2002.codfw.wmnet
* 15:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 15:31 dwisehaupt@dns1006: START - running authdns-update
* 15:27 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling reboot on A:swift-fe-codfw
* 15:26 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-codfw
* 15:26 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2029.codfw.wmnet
* 15:26 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti2029.codfw.wmnet
* 15:26 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:26 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:24 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:24 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:22 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2002.codfw.wmnet
* 15:21 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Rebooting clouddb1023 [[phab:T419960|T419960]]
* 15:20 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2001.codfw.wmnet
* 15:20 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]]
* 15:16 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Rebooting clouddb1022 [[phab:T419960|T419960]]
* 15:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2029.codfw.wmnet
* 15:11 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling reboot on A:durum and A:durum
* 15:09 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2029.codfw.wmnet
* 15:04 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2001.codfw.wmnet
* 15:03 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:02 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling reboot on A:wikidough
* 15:01 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudgw1004.eqiad.wmnet
* 15:01 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 15:00 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2028.codfw.wmnet
* 15:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2028.codfw.wmnet
* 14:56 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:55 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:55 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:54 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]]
* 14:53 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudgw1004.eqiad.wmnet
* 14:53 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-codfw
* 14:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2028.codfw.wmnet
* 14:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2028.codfw.wmnet
* 14:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:50 mvernon@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling reboot on A:swift-fe-eqiad
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw1003.eqiad.wmnet with OS trixie
* 14:31 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:31 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:30 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:28 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:28 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:26 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:22 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1002-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1002-1003].eqiad.wmnet
* 14:22 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1002-1003].eqiad.wmnet
* 14:21 blake@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1002-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:21 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1002-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:20 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:20 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:20 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:18 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253461{{!}}fix(anon warning): remove wring type=signup param (T415160)]], [[gerrit:1253450{{!}}AccountCreation: track account registrations for WE1.8 experiments (T416100)]] (duration: 09m 16s)
* 14:17 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:17 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:14 sgimeno@deploy2002: sgimeno: Continuing with sync
* 14:13 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:13 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:11 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:11 sgimeno@deploy2002: sgimeno: Backport for [[gerrit:1253461{{!}}fix(anon warning): remove wring type=signup param (T415160)]], [[gerrit:1253450{{!}}AccountCreation: track account registrations for WE1.8 experiments (T416100)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:11 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:10 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw1003.eqiad.wmnet with reason: host reimage
* 14:09 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1253461{{!}}fix(anon warning): remove wring type=signup param (T415160)]], [[gerrit:1253450{{!}}AccountCreation: track account registrations for WE1.8 experiments (T416100)]]
* 14:08 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2026.codfw.wmnet
* 14:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 14:08 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:04 arnaudb@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gerrit2002.wikimedia.org with reason: testing
* 14:03 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw1003.eqiad.wmnet with reason: host reimage
* 14:02 arnaudb@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 2:00:00 on gerrit2002.wikimedia.org with reason: [[phab:T418256|T418256]]
* 14:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1003.eqiad.wmnet
* 13:58 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 13:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1003.eqiad.wmnet
* 13:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1002.eqiad.wmnet
* 13:45 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253046{{!}}bowiki: update logos (T419268)]] (duration: 06m 17s)
* 13:45 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudgw1003.eqiad.wmnet with OS trixie
* 13:43 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-codfw
* 13:43 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad
* 13:41 mszwarc@deploy2002: mszwarc, anzx: Continuing with sync
* 13:41 mszwarc@deploy2002: mszwarc, anzx: Backport for [[gerrit:1253046{{!}}bowiki: update logos (T419268)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1002.eqiad.wmnet
* 13:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2004.codfw.wmnet
* 13:39 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1253046{{!}}bowiki: update logos (T419268)]]
* 13:38 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253423{{!}}Always use external actor for interwiki rights logs on target wiki (T6055)]] (duration: 08m 53s)
* 13:34 mszwarc@deploy2002: mszwarc: Continuing with sync
* 13:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2004.codfw.wmnet
* 13:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2026.codfw.wmnet
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2025.codfw.wmnet
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2003.codfw.wmnet
* 13:31 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1253423{{!}}Always use external actor for interwiki rights logs on target wiki (T6055)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 13:29 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1253423{{!}}Always use external actor for interwiki rights logs on target wiki (T6055)]]
* 13:28 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad
* 13:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2003.codfw.wmnet
* 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow3004.esams.wmnet
* 13:25 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling reboot on A:wikidough
* 13:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 13:22 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-durum rolling reboot on A:durum and A:durum
* 13:21 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2025.codfw.wmnet
* 13:21 XioNoX: drain edgeuno transit for optic replacement - [[phab:T415743|T415743]]
* 13:19 cgoubert@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host wikikube-ctrl1004.eqiad.wmnet
* 13:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow3004.esams.wmnet
* 13:14 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251487{{!}}Replace direct BagOStuff with WANObjectCache (T419666)]] (duration: 11m 25s)
* 13:11 bwojtowicz@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow4003.ulsfo.wmnet
* 13:09 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti3005.esams.wmnet
* 13:09 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ganeti3005.esams.wmnet
* 13:07 jforrester@deploy2002: jforrester: Continuing with sync
* 13:06 jforrester@deploy2002: jforrester: Backport for [[gerrit:1251487{{!}}Replace direct BagOStuff with WANObjectCache (T419666)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl1004.eqiad.wmnet
* 13:04 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 13:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ncredir4002.ulsfo.wmnet
* 13:04 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ncredir4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 13:03 jiji@cumin1003: END (ERROR) - Cookbook sre.memcached.roll-reboot-restart (exit_code=97) rolling reboot on A:memcached-gutter-eqiad
* 13:03 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ncredir4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 13:03 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1251487{{!}}Replace direct BagOStuff with WANObjectCache (T419666)]]
* 13:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow4003.ulsfo.wmnet
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow5002.eqsin.wmnet
* 12:51 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl1003.eqiad.wmnet
* 12:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow5002.eqsin.wmnet
* 12:48 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3005.esams.wmnet
* 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow6001.drmrs.wmnet
* 12:44 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:43 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3005.esams.wmnet
* 12:42 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl1003.eqiad.wmnet
* 12:41 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl1002.eqiad.wmnet
* 12:40 aikochou@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 12:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow6001.drmrs.wmnet
* 12:37 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ncredir4002.ulsfo.wmnet
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ncredir4001.ulsfo.wmnet
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ncredir4001.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 12:35 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:34 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow7002.magru.wmnet
* 12:32 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:32 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:32 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:28 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1017
* 12:27 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ncredir4001.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 12:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow7002.magru.wmnet
* 12:27 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1017
* 12:25 aikochou@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 12:25 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl1002.eqiad.wmnet
* 12:22 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:20 moritzm: failover Ganeti master in esams to ganeti3008
* 12:20 moritzm: failover Ganeti master in esams to ganeti3005
* 12:15 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:14 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:10 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ncredir4001.ulsfo.wmnet
* 12:00 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti3006.esams.wmnet
* 12:00 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti3006.esams.wmnet
* 11:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for druid[1009-1013].eqiad.wmnet
* 11:57 btullis@cumin1003: START - Cookbook sre.hosts.remove-downtime for druid[1009-1013].eqiad.wmnet
* 11:57 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.remove-downtime (exit_code=97) for druid[1009-1013].eqiad.wmnet
* 11:57 btullis@cumin1003: START - Cookbook sre.hosts.remove-downtime for druid[1009-1013].eqiad.wmnet
* 11:46 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1009.eqiad.wmnet with OS bookworm
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3006.esams.wmnet
* 11:43 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1010.eqiad.wmnet with OS bookworm
* 11:37 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1011.eqiad.wmnet with OS bookworm
* 11:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1012.eqiad.wmnet with OS bookworm
* 11:30 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3006.esams.wmnet
* 11:29 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1013.eqiad.wmnet with OS bookworm
* 11:24 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:24 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:22 btullis@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6 days, 0:00:00 on dse-k8s-worker[1012,1015-1017].eqiad.wmnet with reason: Adding 10 Gbps NIC
* 11:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1009.eqiad.wmnet with reason: host reimage
* 11:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1010.eqiad.wmnet with reason: host reimage
* 11:15 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:14 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:12 mvernon@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling reboot on A:swift-fe-eqiad
* 11:12 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling reboot on A:swift-fe-codfw
* 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1011.eqiad.wmnet with reason: host reimage
* 11:10 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:10 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1012.eqiad.wmnet with reason: host reimage
* 11:07 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gerrit2003.wikimedia.org
* 11:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1013.eqiad.wmnet with reason: host reimage
* 11:06 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 11:06 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 11:04 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1010.eqiad.wmnet with reason: host reimage
* 11:02 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1011.eqiad.wmnet with reason: host reimage
* 11:02 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1009.eqiad.wmnet with reason: host reimage
* 11:01 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1012.eqiad.wmnet with reason: host reimage
* 11:00 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gerrit2003.wikimedia.org
* 10:57 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1013.eqiad.wmnet with reason: host reimage
* 10:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2010.codfw.wmnet
* 10:47 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1013.eqiad.wmnet with OS bookworm
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1012.eqiad.wmnet with OS bookworm
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1011.eqiad.wmnet with OS bookworm
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1010.eqiad.wmnet with OS bookworm
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1009.eqiad.wmnet with OS bookworm
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3007.esams.wmnet
* 10:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2010.codfw.wmnet
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3007.esams.wmnet
* 10:31 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3007.esams.wmnet
* 10:29 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3007.esams.wmnet
* 10:28 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:28 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3008.esams.wmnet
* 10:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3008.esams.wmnet
* 10:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2009.codfw.wmnet
* 10:24 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:24 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:23 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:20 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2009.codfw.wmnet
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3008.esams.wmnet
* 10:14 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3008.esams.wmnet
* 10:09 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:08 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2004.codfw.wmnet
* 10:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7001.magru.wmnet
* 10:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7001.magru.wmnet
* 10:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2004.codfw.wmnet
* 09:56 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7001.magru.wmnet
* 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts tcp-proxy4002.ulsfo.wmnet
* 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: tcp-proxy4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 09:54 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: tcp-proxy4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 09:51 elukey@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 09:51 elukey@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 09:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7001.magru.wmnet
* 09:51 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:46 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts tcp-proxy4002.ulsfo.wmnet
* 09:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: decom tcp-proxy4001 - jmm@cumin2002"
* 09:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: decom tcp-proxy4001 - jmm@cumin2002"
* 09:43 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm2001.wikimedia.org
* 09:39 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idm2001.wikimedia.org
* 09:38 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:38 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 09:38 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 09:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:35 slyngshede@dns1004: END - running authdns-update
* 09:34 elukey@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 09:34 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 09:34 elukey@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 09:33 slyngshede@dns1004: START - running authdns-update
* 09:32 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:30 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm1001.wikimedia.org
* 09:26 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 09:26 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idm1001.wikimedia.org
* 09:24 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm-test1001.wikimedia.org
* 09:22 moritzm: failover Ganeti master in magru to ganeti7004
* 09:21 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts tcp-proxy4001.ulsfo.wmnet
* 09:21 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 09:20 jmm@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-test-eqiad
* 09:20 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idm-test1001.wikimedia.org
* 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2002.codfw.wmnet
* 09:18 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:15 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM cloudidp2001-dev.codfw.wmnet
* 09:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2002.codfw.wmnet
* 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7002.magru.wmnet
* 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7002.magru.wmnet
* 09:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2001.codfw.wmnet
* 09:13 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts tcp-proxy4001.ulsfo.wmnet
* 09:11 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM cloudidp2001-dev.codfw.wmnet
* 09:09 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idp2005.wikimedia.org
* 09:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2001.codfw.wmnet
* 09:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7002.magru.wmnet
* 09:05 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idp2005.wikimedia.org
* 09:03 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7002.magru.wmnet
* 09:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pki-root1002.eqiad.wmnet
* 08:59 slyngshede@dns1004: END - running authdns-update
* 08:58 slyngshede@dns1004: START - running authdns-update
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7003.magru.wmnet
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7003.magru.wmnet
* 08:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host pki-root1002.eqiad.wmnet
* 08:49 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idp1005.wikimedia.org
* 08:48 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:48 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 08:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7003.magru.wmnet
* 08:48 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 08:47 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7003.magru.wmnet
* 08:44 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idp1005.wikimedia.org
* 08:44 jmm@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-test-eqiad
* 08:44 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 08:39 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idp-test1005.wikimedia.org
* 08:35 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idp-test1005.wikimedia.org
* 08:33 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idp-test2005.wikimedia.org
* 08:29 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idp-test2005.wikimedia.org
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7004.magru.wmnet
* 08:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7004.magru.wmnet
* 08:22 taavi@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet,service=s3
* 08:18 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251276{{!}}Fix broken survey links on PersonalDashboard (T419950)]] (duration: 32m 09s)
* 08:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7004.magru.wmnet
* 08:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7004.magru.wmnet
* 08:06 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 08:05 kgraessle@deploy2002: kgraessle: Continuing with sync
* 08:04 kgraessle@deploy2002: kgraessle: Backport for [[gerrit:1251276{{!}}Fix broken survey links on PersonalDashboard (T419950)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:59 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 07:52 moritzm: installing Linux 5.10.251 on Bullseye hosts
* 07:45 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1251276{{!}}Fix broken survey links on PersonalDashboard (T419950)]]
* 07:37 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stewards1001.eqiad.wmnet
* 07:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host stewards1001.eqiad.wmnet
* 07:33 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1003.eqiad.wmnet
* 07:26 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1003.eqiad.wmnet
* 07:25 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aphlict1002.eqiad.wmnet
* 07:21 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host aphlict1002.eqiad.wmnet
* 07:10 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host doc2003.codfw.wmnet
* 07:06 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host doc2003.codfw.wmnet
* 07:02 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists1004.wikimedia.org
* 06:55 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists1004.wikimedia.org
* 05:25 aikochou@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 52s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-15 ==
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 52s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-14 ==
* 14:16 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251941{{!}}CommonSettings: Set class in $wgCentralAuthRC]] (duration: 06m 17s)
* 14:12 reedy@deploy2002: reedy: Continuing with sync
* 14:11 reedy@deploy2002: reedy: Backport for [[gerrit:1251941{{!}}CommonSettings: Set class in $wgCentralAuthRC]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:10 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1251941{{!}}CommonSettings: Set class in $wgCentralAuthRC]]
* 12:51 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251912{{!}}CommonSettings: Specify class in IRC RCFeed setup]] (duration: 06m 19s)
* 12:47 reedy@deploy2002: reedy, lcawte: Continuing with sync
* 12:46 reedy@deploy2002: reedy, lcawte: Backport for [[gerrit:1251912{{!}}CommonSettings: Specify class in IRC RCFeed setup]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:44 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1251912{{!}}CommonSettings: Specify class in IRC RCFeed setup]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 00s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-13 ==
* 22:52 taavi: taavi@deploy2002 ~ $ mwscript CentralAuth:attachAccount.php --wiki=metawiki --userlist backfiller.txt # unify unified Special:CentralAuth/MediaWikiAccountBackfiller on meta
* 20:07 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2004-dev.codfw.wmnet
* 20:01 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudservices2004-dev.codfw.wmnet
* 20:01 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2004-dev.codfw.wmnet
* 19:55 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4052.*
* 19:54 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudservices2004-dev.codfw.wmnet
* 19:54 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4052.ulsfo.wmnet with OS trixie
* 19:53 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2005-dev.codfw.wmnet
* 19:46 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudservices2005-dev.codfw.wmnet
* 19:46 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2004-dev.codfw.wmnet
* 19:40 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4050.*
* 19:40 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudservices2004-dev.codfw.wmnet
* 19:29 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 19:24 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 19:23 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4050.ulsfo.wmnet
* 19:19 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1035.eqiad.wmnet with OS trixie
* 19:19 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1034.eqiad.wmnet with OS trixie
* 19:18 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:18 jclark@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:18 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:16 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4051.*
* 19:15 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp4050.ulsfo.wmnet
* 19:14 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4050.ulsfo.wmnet
* 19:13 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:11 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4051.ulsfo.wmnet with OS trixie
* 19:07 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp4050.ulsfo.wmnet
* 19:02 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1035.eqiad.wmnet with reason: host reimage
* 19:00 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS bookworm
* 19:00 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:00 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 18:58 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1034.eqiad.wmnet with reason: host reimage
* 18:58 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS trixie
* 18:57 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4052.ulsfo.wmnet with OS trixie
* 18:55 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1035.eqiad.wmnet with reason: host reimage
* 18:55 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1034.eqiad.wmnet with reason: host reimage
* 18:47 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4051.ulsfo.wmnet with reason: host reimage
* 18:43 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 18:41 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4051.ulsfo.wmnet with reason: host reimage
* 18:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1035.eqiad.wmnet with OS trixie
* 18:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1034.eqiad.wmnet with OS trixie
* 18:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1033.eqiad.wmnet with OS trixie
* 18:36 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 18:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp4050.ulsfo.wmnet with reason: firmware updates
* 18:34 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS trixie
* 18:24 brett@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp4050.ulsfo.wmnet
* 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4050.ulsfo.wmnet with OS trixie
* 18:22 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS bookworm
* 18:21 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1374.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 18:21 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4051.ulsfo.wmnet with OS trixie
* 18:21 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4051.ulsfo.wmnet with OS trixie
* 18:12 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS bookworm
* 18:10 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1374.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 18:10 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:10 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 18:10 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 18:10 elukey@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1253.eqiad.wmnet with reason: Host went down and paged, depooled
* 18:06 cgoubert@cumin1003: dbctl commit (dc=all): 'Depool db1253', diff saved to https://phabricator.wikimedia.org/P89856 and previous config saved to /var/cache/conftool/dbconfig/20260313-180640-cgoubert.json
* 18:06 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 18:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4051.ulsfo.wmnet with OS trixie
* 18:03 elukey: powercycle db1253 - host not reachable via ssh, no events logged in racadm getsel, no console com2 available (blank screen)
* 17:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 17:56 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 17:49 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4049.*
* 17:46 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4049.ulsfo.wmnet with OS trixie
* 17:37 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:37 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:36 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie
* 17:35 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4050.ulsfo.wmnet with OS trixie
* 17:35 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:34 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:27 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 17:26 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 17:26 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 17:26 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 17:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4049.ulsfo.wmnet with reason: host reimage
* 17:17 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie
* 17:17 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:16 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:16 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4049.ulsfo.wmnet with reason: host reimage
* 17:12 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1016.eqiad.wmnet
* 17:12 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1016.eqiad.wmnet
* 17:11 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1016.eqiad.wmnet
* 17:11 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4048.*
* 17:10 dhinus: (relogging failed sal) conftool action : set/pooled=no; selector: name=clouddb1016.eqiad.wmnet
* 17:10 dhinus: (relogging failed sal) DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1016.eqiad.wmnet with reason: Rebooting clouddb1016 [[phab:T419960|T419960]]
* 17:09 dhinus: (relogging failed sal) END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1015.eqiad.wmnet
* 17:08 dhinus: (relogging failed sal) START - Cookbook sre.hosts.remove-downtime for clouddb1015.eqiad.wmnet
* 17:08 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 17:07 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 17:07 dhinus: fnegri@cumin1003 conftool action : set/pooled=yes; selector: name=clouddb1015.eqiad.wmnet
* 17:07 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 17:07 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4048.ulsfo.wmnet with OS trixie
* 17:06 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 16:40 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4049.ulsfo.wmnet with OS trixie
* 16:39 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4048.ulsfo.wmnet with reason: host reimage
* 16:36 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1015.eqiad.wmnet
* 16:35 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1015.eqiad.wmnet with reason: Rebooting clouddb1015 [[phab:T419960|T419960]]
* 16:34 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1014.eqiad.wmnet
* 16:34 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1014.eqiad.wmnet
* 16:34 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet
* 16:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet
* 16:28 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudweb1003.wikimedia.org
* 16:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet
* 16:22 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudweb1003.wikimedia.org
* 16:21 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudweb1004.wikimedia.org
* 16:20 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1014.eqiad.wmnet with reason: Rebooting clouddb1014 [[phab:T419960|T419960]]
* 16:20 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1014.eqiad.wmnet
* 16:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor-dev2001.codfw.wmnet
* 16:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4048.ulsfo.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4048.ulsfo.wmnet with OS trixie
* 16:16 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudweb1004.wikimedia.org
* 16:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor-dev2001.codfw.wmnet
* 16:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor2003.codfw.wmnet
* 16:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor2003.codfw.wmnet
* 16:00 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:00 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:00 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4048.ulsfo.wmnet with OS trixie
* 15:43 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2010-dev.codfw.wmnet
* 15:38 vgutierrez@cumin1003: END (PASS) - Cookbook sre.loadbalancer.check-ipip (exit_code=0)
* 15:38 vgutierrez@cumin1003: START - Cookbook sre.loadbalancer.check-ipip
* 15:37 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 15:37 vgutierrez@cumin1003: END (FAIL) - Cookbook sre.loadbalancer.check-ipip (exit_code=99)
* 15:37 vgutierrez@cumin1003: START - Cookbook sre.loadbalancer.check-ipip
* 15:36 vgutierrez@cumin1003: END (FAIL) - Cookbook sre.loadbalancer.check-ipip (exit_code=99)
* 15:36 vgutierrez@cumin1003: START - Cookbook sre.loadbalancer.check-ipip
* 15:36 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2010-dev.codfw.wmnet
* 15:35 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2005-dev.codfw.wmnet
* 15:35 vgutierrez@cumin1003: END (FAIL) - Cookbook sre.loadbalancer.check-ipip (exit_code=99)
* 15:35 vgutierrez@cumin1003: START - Cookbook sre.loadbalancer.check-ipip
* 15:28 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2005-dev.codfw.wmnet
* 15:26 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:25 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:23 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:22 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 15:22 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 15:22 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 15:19 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:19 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:19 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:16 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2006-dev.codfw.wmnet
* 15:12 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudidp2001-dev.codfw.wmnet
* 15:08 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudidp2001-dev.codfw.wmnet
* 15:07 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2006-dev.codfw.wmnet
* 14:58 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1013.eqiad.wmnet
* 14:58 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1013.eqiad.wmnet
* 14:57 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet,service=s3
* 14:57 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet,service=s1
* 14:48 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1015.eqiad.wmnet
* 14:46 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1035.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:46 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2005-dev.codfw.wmnet
* 14:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti-test2003.codfw.wmnet
* 14:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2003.codfw.wmnet
* 14:44 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1373.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:43 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1015.eqiad.wmnet
* 14:43 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1034.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:42 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1033.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:40 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1023
* 14:40 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1023
* 14:40 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1022
* 14:40 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1022
* 14:40 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1021
* 14:39 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup2004.codfw.wmnet
* 14:39 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1021
* 14:38 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2005-dev.codfw.wmnet
* 14:37 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1020
* 14:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2003.codfw.wmnet
* 14:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti-test2003.codfw.wmnet
* 14:35 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1020
* 14:35 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Rebooting clouddb1013 [[phab:T419960|T419960]]
* 14:33 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:33 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1035.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:33 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet
* 14:32 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1035.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:32 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1035.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti-test2003.codfw.wmnet
* 14:32 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1034.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti-test2003.codfw.wmnet
* 14:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1033.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1373.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:29 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:29 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt - jclark@cumin1003"
* 14:29 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt - jclark@cumin1003"
* 14:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti-test2002.codfw.wmnet
* 14:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet
* 14:27 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudbackup2004.codfw.wmnet
* 14:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cuminunpriv1001.eqiad.wmnet
* 14:25 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup2003.codfw.wmnet
* 14:25 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 14:25 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 14:24 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 14:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cuminunpriv1001.eqiad.wmnet
* 14:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet
* 14:22 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1004.eqiad.wmnet
* 14:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti-test2002.codfw.wmnet
* 14:14 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudbackup2003.codfw.wmnet
* 14:13 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudbackup1004.eqiad.wmnet
* 14:09 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1003.eqiad.wmnet
* 14:01 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudbackup1003.eqiad.wmnet
* 13:59 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gerrit1003.wikimedia.org
* 13:53 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gerrit1003.wikimedia.org
* 13:49 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists2001.wikimedia.org
* 13:48 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1004.eqiad.wmnet
* 13:46 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host etherpad1004.eqiad.wmnet
* 13:45 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:45 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:44 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1004.eqiad.wmnet
* 13:42 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists2001.wikimedia.org
* 13:42 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host etherpad1004.eqiad.wmnet
* 13:37 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host etherpad2002.codfw.wmnet
* 13:36 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gerrit2002.wikimedia.org
* 13:33 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host etherpad2002.codfw.wmnet
* 13:32 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2003.wikimedia.org
* 13:30 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gerrit2002.wikimedia.org
* 13:26 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab2003.wikimedia.org
* 13:26 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
* 13:24 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2020.codfw.wmnet
* 13:23 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2019.codfw.wmnet
* 13:19 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
* 13:19 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1003.wikimedia.org
* 13:13 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2020.codfw.wmnet
* 13:13 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1003.wikimedia.org
* 13:12 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2019.codfw.wmnet
* 13:11 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.reboot-runner (exit_code=0) rolling reboot on A:gitlab-runner
* 13:05 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2018.codfw.wmnet
* 13:05 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1020.eqiad.wmnet
* 12:54 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2018.codfw.wmnet
* 12:54 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1020.eqiad.wmnet
* 12:54 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2017.codfw.wmnet
* 12:54 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1019.eqiad.wmnet
* 12:53 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:50 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:50 moritzm: powercycle pki1002
* 12:48 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:47 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:44 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:44 mutante: rebooted phab1005 - waiting for it to come back
* 12:44 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:43 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2017.codfw.wmnet
* 12:43 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1019.eqiad.wmnet
* 12:42 bwojtowicz@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 12:40 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1018.eqiad.wmnet
* 12:39 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2016.codfw.wmnet
* 12:31 jelto@cumin1003: START - Cookbook sre.gitlab.reboot-runner rolling reboot on A:gitlab-runner
* 12:29 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1018.eqiad.wmnet
* 12:29 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1017.eqiad.wmnet
* 12:28 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2016.codfw.wmnet
* 12:27 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2015.codfw.wmnet
* 12:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast1004.wikimedia.org
* 12:18 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host doc1004.eqiad.wmnet
* 12:18 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1017.eqiad.wmnet
* 12:17 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:17 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:15 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2015.codfw.wmnet
* 12:15 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:15 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:14 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host doc1004.eqiad.wmnet
* 12:13 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aphlict2001.codfw.wmnet
* 12:10 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host aphlict2001.codfw.wmnet
* 12:10 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: reboot
* 12:10 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts2002.codfw.wmnet
* 12:07 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:07 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:03 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts2002.codfw.wmnet
* 12:02 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:02 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:01 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1016.eqiad.wmnet
* 12:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1019.eqiad.wmnet
* 11:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1018.eqiad.wmnet
* 11:59 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:59 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1019.eqiad.wmnet
* 11:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1018.eqiad.wmnet
* 11:51 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:51 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:50 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1016.eqiad.wmnet
* 11:49 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-backup2004.codfw.wmnet
* 11:43 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-backup2004.codfw.wmnet
* 11:43 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-backup1004.eqiad.wmnet
* 11:37 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-backup1004.eqiad.wmnet
* 11:36 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-backup2003.codfw.wmnet
* 11:34 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-backup1003.eqiad.wmnet
* 11:32 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 11:32 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:30 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-backup2003.codfw.wmnet
* 11:28 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-backup1003.eqiad.wmnet
* 11:27 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:26 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy1001.eqiad.wmnet
* 11:21 arnaudb@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host contint1003.wikimedia.org
* 11:21 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy1001.eqiad.wmnet
* 11:21 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy1002.eqiad.wmnet
* 11:16 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy1002.eqiad.wmnet
* 11:16 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy2001.codfw.wmnet
* 11:16 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host contint1003.wikimedia.org
* 11:12 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-staging-master-codfw
* 11:12 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul1001.eqiad.wmnet
* 11:11 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy2001.codfw.wmnet
* 11:11 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy2002.codfw.wmnet
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1018.eqiad.wmnet with reason: host reimage
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:09 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-staging-master-eqiad
* 11:08 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:08 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul1001.eqiad.wmnet
* 11:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1008-dev.eqiad.wmnet
* 11:07 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy2002.codfw.wmnet
* 11:06 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy3001.esams.wmnet
* 11:05 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1018.eqiad.wmnet with reason: host reimage
* 11:01 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy3001.esams.wmnet
* 11:01 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1008-dev.eqiad.wmnet
* 11:01 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1002-dev.eqiad.wmnet
* 11:01 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy3002.esams.wmnet
* 10:59 mvernon@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 22:00:00 on db1258.eqiad.wmnet with reason: depooled, likely to flap over the weekend
* 10:57 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudbackup1002-dev.eqiad.wmnet
* 10:57 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1001-dev.eqiad.wmnet
* 10:56 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy3002.esams.wmnet
* 10:56 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-staging-master-codfw
* 10:55 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4001.ulsfo.wmnet
* 10:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-staging-worker-codfw
* 10:54 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudbackup1001-dev.eqiad.wmnet
* 10:52 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-staging-master-eqiad
* 10:50 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-staging-worker-eqiad
* 10:50 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 10:50 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4001.ulsfo.wmnet
* 10:50 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4002.ulsfo.wmnet
* 10:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1019.eqiad.wmnet with reason: host reimage
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1019.eqiad.wmnet with reason: host reimage
* 10:45 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4002.ulsfo.wmnet
* 10:45 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy5001.eqsin.wmnet
* 10:40 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy5001.eqsin.wmnet
* 10:39 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy5002.eqsin.wmnet
* 10:37 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul2001.codfw.wmnet
* 10:37 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool', diff saved to https://phabricator.wikimedia.org/P89852 and previous config saved to /var/cache/conftool/dbconfig/20260313-103719-ladsgroup.json
* 10:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul2001.codfw.wmnet
* 10:32 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy5002.eqsin.wmnet
* 10:31 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudweb2002-dev.wikimedia.org
* 10:31 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul1002.eqiad.wmnet
* 10:31 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy6001.drmrs.wmnet
* 10:29 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:29 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:28 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:28 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:28 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 10:27 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 10:27 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:27 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul1002.eqiad.wmnet
* 10:27 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:26 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy6001.drmrs.wmnet
* 10:24 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudweb2002-dev.wikimedia.org
* 10:23 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy6002.drmrs.wmnet
* 10:22 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul2002.codfw.wmnet
* 10:19 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1008.eqiad.wmnet
* 10:18 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul2002.codfw.wmnet
* 10:18 arnaudb@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host zuul2002.codfw.wmnet
* 10:18 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul2002.codfw.wmnet
* 10:18 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy6002.drmrs.wmnet
* 10:16 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy7002.magru.wmnet
* 10:16 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-staging-worker-eqiad
* 10:15 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-staging-worker-codfw
* 10:13 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1008.eqiad.wmnet
* 10:13 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1007.eqiad.wmnet
* 10:12 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy7002.magru.wmnet
* 10:09 jelto@cumin1003: conftool action : set/pooled=yes; selector: name=tcp-proxy7001.magru.wmnet
* 10:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1007.eqiad.wmnet
* 10:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1006.eqiad.wmnet
* 10:07 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy7001.magru.wmnet
* 10:03 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy7001.magru.wmnet
* 10:02 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1006.eqiad.wmnet
* 10:02 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1005.eqiad.wmnet
* 10:01 jelto@cumin1003: conftool action : set/pooled=no; selector: name=tcp-proxy7001.magru.wmnet
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 09:57 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1005.eqiad.wmnet
* 09:57 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1004.eqiad.wmnet
* 09:51 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1004.eqiad.wmnet
* 09:51 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1003.eqiad.wmnet
* 09:50 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:50 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:46 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1003.eqiad.wmnet
* 09:46 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1002.eqiad.wmnet
* 09:41 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1002.eqiad.wmnet
* 09:40 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1001.eqiad.wmnet
* 09:39 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:39 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:35 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1001.eqiad.wmnet
* 09:35 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-ctrl1002.eqiad.wmnet
* 09:34 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-ctrl1001.eqiad.wmnet
* 09:34 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:33 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:32 moritzm: installing Linux 6.1.164 on Bookworm hosts
* 09:30 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-ctrl1002.eqiad.wmnet
* 09:28 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-ctrl1001.eqiad.wmnet
* 09:01 elukey@cumin1003: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 08:37 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 07:56 moritzm: installing Linux 6.12.74 on Trixie hosts
* 07:55 moritzm: installing 6.12.74 on Trixie hosts
* 02:57 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4044.ulsfo.wmnet [reason: trixie reimaging]
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 18s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:41 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4044.ulsfo.wmnet with OS trixie
* 01:37 mutante: contint1003/contint2003 - every time(?) we set up machines with puppet using our httpd module and PHP, the first puppet run hits the same old issue: "Exec[ensure_present_mod_php" failing and "Considering conflict mpm_worker for mpm_prefork". The fix is to run 'sudo a2dismod mpm_event' and then run puppet again. [[phab:T418521|T418521]]
* 01:26 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on contint1003.wikimedia.org with reason: [[phab:T418521|T418521]]
* 01:26 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on contint2003.wikimedia.org with reason: [[phab:T418521|T418521]]
* 01:23 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on contint2003.wikimedia.org with reason: setup
* 01:22 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on contint1003.wikimedia.org with reason: setup
* 01:22 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4047.*
* 01:09 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage
* 01:08 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4043.ulsfo.wmnet [reason: trixie reimaging]
* 01:06 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage
* 01:05 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4043.ulsfo.wmnet with OS trixie
* 00:55 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4047.ulsfo.wmnet with OS trixie
* 00:45 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4044.ulsfo.wmnet with OS trixie
* 00:45 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4044.ulsfo.wmnet [reason: trixie reimaging]
* 00:42 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4042.ulsfo.wmnet [reason: trixie reimaging]
* 00:41 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4042.ulsfo.wmnet with OS trixie
* 00:39 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4043.ulsfo.wmnet with reason: host reimage
* 00:31 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4043.ulsfo.wmnet with reason: host reimage
* 00:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4047.ulsfo.wmnet with reason: host reimage
* 00:27 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1251187 [[phab:T419637|T419637]] (duration: 07m 12s)
* 00:23 rzl@deploy2002: rzl: Continuing with sync
* 00:23 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4047.ulsfo.wmnet with reason: host reimage
* 00:22 rzl@deploy2002: rzl: https://gerrit.wikimedia.org/r/1251187 [[phab:T419637|T419637]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:21 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1251187 [[phab:T419637|T419637]]
* 00:15 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage
* 00:14 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4040.ulsfo.wmnet [reason: trixie reimaging]
* 00:11 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage
* 00:11 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4043.ulsfo.wmnet with OS trixie
* 00:10 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4040.ulsfo.wmnet with OS trixie
* 00:04 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4047.ulsfo.wmnet with OS trixie
* 00:03 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4047.ulsfo.wmnet with OS trixie
* 00:03 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4043.ulsfo.wmnet with OS trixie
== 2026-03-12 ==
* 23:57 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host o11ytest1001.eqiad.wmnet with OS trixie
* 23:53 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 23:53 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 23:50 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4042.ulsfo.wmnet with OS trixie
* 23:49 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 23:49 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 23:45 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 23:45 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 23:45 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage
* 23:44 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4042.ulsfo.wmnet with OS trixie
* 23:41 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage
* 23:41 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 23:41 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 23:40 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on o11ytest1001.eqiad.wmnet with reason: host reimage
* 23:36 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 23:36 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on o11ytest1001.eqiad.wmnet with reason: host reimage
* 23:36 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 23:35 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 23:35 rzl@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 23:22 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host o11ytest1001
* 23:22 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host o11ytest1001
* 23:21 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4040.ulsfo.wmnet with OS trixie
* 23:19 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4040.ulsfo.wmnet with OS trixie
* 23:18 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host o11ytest1001
* 23:18 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) o11ytest1001.eqiad.wmnet 141.32.64.10.in-addr.arpa 1.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 23:18 herron@cumin1003: START - Cookbook sre.dns.wipe-cache o11ytest1001.eqiad.wmnet 141.32.64.10.in-addr.arpa 1.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 23:18 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:18 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host o11ytest1001 - herron@cumin1003"
* 23:18 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host o11ytest1001 - herron@cumin1003"
* 23:04 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4047.ulsfo.wmnet with OS trixie
* 23:00 herron@cumin1003: START - Cookbook sre.dns.netbox
* 23:00 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host o11ytest1001
* 22:59 herron@cumin1003: START - Cookbook sre.hosts.reimage for host o11ytest1001.eqiad.wmnet with OS trixie
* 22:58 herron@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mwlog1002 to o11ytest1001
* 22:57 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host o11ytest1001
* 22:55 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host o11ytest1001
* 22:55 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) o11ytest1001 on all recursors
* 22:55 herron@cumin1003: START - Cookbook sre.dns.wipe-cache o11ytest1001 on all recursors
* 22:55 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:55 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mwlog1002 to o11ytest1001 - herron@cumin1003"
* 22:54 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mwlog1002 to o11ytest1001 - herron@cumin1003"
* 22:51 herron@cumin1003: START - Cookbook sre.dns.netbox
* 22:50 herron@cumin1003: START - Cookbook sre.hosts.rename from mwlog1002 to o11ytest1001
* 22:42 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4043.ulsfo.wmnet with OS trixie
* 22:42 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4043.ulsfo.wmnet [reason: trixie reimaging]
* 22:41 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4041.ulsfo.wmnet [reason: trixie reimaging]
* 22:39 bvibber@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251190{{!}}Enable ReaderExperiments Share Highlight subfeature for metrics (T416945)]], [[gerrit:1251195{{!}}Metrics module for share highlight experiment baseline (T416945)]] (duration: 06m 49s)
* 22:38 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4041.ulsfo.wmnet with OS trixie
* 22:35 bvibber@deploy2002: bvibber: Continuing with sync
* 22:34 bvibber@deploy2002: bvibber: Backport for [[gerrit:1251190{{!}}Enable ReaderExperiments Share Highlight subfeature for metrics (T416945)]], [[gerrit:1251195{{!}}Metrics module for share highlight experiment baseline (T416945)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:32 bvibber@deploy2002: Started scap sync-world: Backport for [[gerrit:1251190{{!}}Enable ReaderExperiments Share Highlight subfeature for metrics (T416945)]], [[gerrit:1251195{{!}}Metrics module for share highlight experiment baseline (T416945)]]
* 22:28 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1251182 [[phab:T419637|T419637]] (duration: 11m 18s)
* 22:27 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host o11ytest2001.codfw.wmnet with OS trixie
* 22:26 rzl@deploy2002: rzl: Continuing with sync
* 22:24 rzl@deploy2002: rzl: https://gerrit.wikimedia.org/r/1251182 [[phab:T419637|T419637]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:23 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4042.ulsfo.wmnet with OS trixie
* 22:23 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4042.ulsfo.wmnet [reason: trixie reimaging]
* 22:20 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4046.*
* 22:17 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1251182 [[phab:T419637|T419637]]
* 22:13 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4041.ulsfo.wmnet with reason: host reimage
* 22:09 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on o11ytest2001.codfw.wmnet with reason: host reimage
* 22:08 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4041.ulsfo.wmnet with reason: host reimage
* 22:03 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on o11ytest2001.codfw.wmnet with reason: host reimage
* 22:01 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:58 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4040.ulsfo.wmnet with OS trixie
* 21:58 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4040.ulsfo.wmnet [reason: trixie reimaging]
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host o11ytest2001
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host o11ytest2001
* 21:45 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host o11ytest2001
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) o11ytest2001.codfw.wmnet 9.32.192.10.in-addr.arpa 9.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:45 herron@cumin1003: START - Cookbook sre.dns.wipe-cache o11ytest2001.codfw.wmnet 9.32.192.10.in-addr.arpa 9.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host o11ytest2001 - herron@cumin1003"
* 21:45 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host o11ytest2001 - herron@cumin1003"
* 21:43 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4041.ulsfo.wmnet with OS trixie
* 21:41 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4038.ulsfo.wmnet [reason: trixie reimaging]
* 21:40 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4038.ulsfo.wmnet with OS trixie
* 21:39 herron@cumin1003: START - Cookbook sre.dns.netbox
* 21:39 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host o11ytest2001
* 21:39 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:39 herron@cumin1003: START - Cookbook sre.hosts.reimage for host o11ytest2001.codfw.wmnet with OS trixie
* 21:36 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 21:35 herron@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mwlog2002 to o11ytest2001
* 21:35 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 21:35 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 21:35 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host o11ytest2001
* 21:34 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 21:34 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 21:33 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 21:32 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host o11ytest2001
* 21:32 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) o11ytest2001 on all recursors
* 21:32 herron@cumin1003: START - Cookbook sre.dns.wipe-cache o11ytest2001 on all recursors
* 21:32 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:32 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mwlog2002 to o11ytest2001 - herron@cumin1003"
* 21:31 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mwlog2002 to o11ytest2001 - herron@cumin1003"
* 21:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4046.ulsfo.wmnet with OS trixie
* 21:27 herron@cumin1003: START - Cookbook sre.dns.netbox
* 21:26 herron@cumin1003: START - Cookbook sre.hosts.rename from mwlog2002 to o11ytest2001
* 21:20 rzl: rzl@apt1002:~$ sudo -i reprepro copy trixie-wikimedia bullseye-wikimedia envoyproxy
* 21:20 rzl: rzl@apt1002:~$ sudo -i reprepro copy bookworm-wikimedia bullseye-wikimedia envoyproxy
* 21:20 rzl: rzl@apt1002:~$ sudo -i reprepro -C main includedeb bullseye-wikimedia /srv/wikimedia/pool/component/envoy-future/e/envoyproxy/envoyproxy_1.35.9-1_amd64.deb
* 21:13 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage
* 21:13 cscott@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251173{{!}}Revert "Move post-processing of flaggedrevs views inside FlaggablePageView"]] (duration: 07m 28s)
* 21:09 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage
* 21:09 cscott@deploy2002: cscott: Continuing with sync
* 21:07 cscott@deploy2002: cscott: Backport for [[gerrit:1251173{{!}}Revert "Move post-processing of flaggedrevs views inside FlaggablePageView"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:05 cscott@deploy2002: Started scap sync-world: Backport for [[gerrit:1251173{{!}}Revert "Move post-processing of flaggedrevs views inside FlaggablePageView"]]
* 21:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage
* 21:02 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251152{{!}}Use 'alwaysShowLogin' query parameter during login (T419723)]], [[gerrit:1251150{{!}}login: Add 'alwaysShowLogin' login URL parameter (T419723)]], [[gerrit:1251168{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1250750{{!}}Enable parser survey for opted-out users on ru/pt/ja/id wikis (T414852)]] (duration: 10m 41s)
* 20:58 tgr@deploy2002: tgr, jsn, cscott: Continuing with sync
* 20:58 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage
* 20:54 tgr@deploy2002: tgr, jsn, cscott: Backport for [[gerrit:1251152{{!}}Use 'alwaysShowLogin' query parameter during login (T419723)]], [[gerrit:1251150{{!}}login: Add 'alwaysShowLogin' login URL parameter (T419723)]], [[gerrit:1251168{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1250750{{!}}Enable parser survey for opted-out users on ru/pt/ja/id wikis (T414852)]] synced to the testservers (see https://wikitech
* 20:52 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1251152{{!}}Use 'alwaysShowLogin' query parameter during login (T419723)]], [[gerrit:1251150{{!}}login: Add 'alwaysShowLogin' login URL parameter (T419723)]], [[gerrit:1251168{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1250750{{!}}Enable parser survey for opted-out users on ru/pt/ja/id wikis (T414852)]]
* 20:49 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 20:43 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251087{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251088{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251106{{!}}phpunit: Avoid unnecessary writes in generatePHPUnitConfig.php (T419107)]] (duration: 07m 37s)
* 20:39 tgr@deploy2002: tgr, daimona: Continuing with sync
* 20:37 tgr@deploy2002: tgr, daimona: Backport for [[gerrit:1251087{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251088{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251106{{!}}phpunit: Avoid unnecessary writes in generatePHPUnitConfig.php (T419107)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:37 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4046.ulsfo.wmnet with OS trixie
* 20:35 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1251087{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251088{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251106{{!}}phpunit: Avoid unnecessary writes in generatePHPUnitConfig.php (T419107)]]
* 20:35 jsn@deploy2002: Synchronized wmf-config/throttle.php: (no justification provided) (duration: 01m 57s)
* 20:32 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4045.*
* 20:28 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4041.ulsfo.wmnet with OS trixie
* 20:20 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 20:18 jsn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249364{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1251140{{!}}[arwikiquote] add namespace alias for NS_PROJECT (T419828)]], [[gerrit:1251098{{!}}Deploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1251164{{!}}Increase IP cap limit for azwiki (T419899)]] (duration: 11m 11s)
* 20:14 jsn@deploy2002: jsn, dani, nmw03, gergesshamon: Continuing with sync
* 20:09 jsn@deploy2002: jsn, dani, nmw03, gergesshamon: Backport for [[gerrit:1249364{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1251140{{!}}[arwikiquote] add namespace alias for NS_PROJECT (T419828)]], [[gerrit:1251098{{!}}Deploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1251164{{!}}Increase IP cap limit for azwiki (T419899)]] synced to the testservers (see https://wikitech.wikimedia.org/wik
* 20:07 jsn@deploy2002: Started scap sync-world: Backport for [[gerrit:1249364{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1251140{{!}}[arwikiquote] add namespace alias for NS_PROJECT (T419828)]], [[gerrit:1251098{{!}}Deploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1251164{{!}}Increase IP cap limit for azwiki (T419899)]]
* 19:21 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mathoid: apply
* 19:21 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mathoid: apply
* 19:20 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mathoid: apply
* 19:19 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mathoid: apply
* 19:16 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 19:16 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 19:15 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 19:14 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 19:13 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 19:12 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 19:12 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 19:11 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 19:07 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4041.ulsfo.wmnet with OS trixie
* 19:06 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4041.ulsfo.wmnet [reason: trixie reimaging]
* 19:06 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4039.ulsfo.wmnet [reason: trixie reimaging]
* 19:06 swfrench-wmf: reprepro include xdebug_3.4.4-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include wikidiff2_1.14.1-2+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include php-yaml_2.2.4-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include php-xhprof_2.3.10-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include php-wmerrors_2.0.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include php-uuid_1.3.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 brennen@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251138{{!}}EditPage: Re-add catch block for MWException (T419883)]] (duration: 09m 46s)
* 19:04 swfrench-wmf: reprepro include php-redis_6.2.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:04 swfrench-wmf: reprepro include php-pcov_1.0.12-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:04 swfrench-wmf: reprepro include php-memcached_3.3.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:04 swfrench-wmf: reprepro include php-luasandbox_4.1.2-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:03 swfrench-wmf: reprepro include php-imagick_3.7.0-13+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:03 swfrench-wmf: reprepro include php-excimer_1.2.5-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:01 brennen@deploy2002: somerandomdeveloper, brennen: Continuing with sync
* 18:59 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 18:57 brennen@deploy2002: somerandomdeveloper, brennen: Backport for [[gerrit:1251138{{!}}EditPage: Re-add catch block for MWException (T419883)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:55 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4039.ulsfo.wmnet with OS trixie
* 18:55 brennen@deploy2002: Started scap sync-world: Backport for [[gerrit:1251138{{!}}EditPage: Re-add catch block for MWException (T419883)]]
* 18:52 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mathoid: apply
* 18:52 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mathoid: apply
* 18:42 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp20(2[789]{{!}}3[0-9]{{!}}40).*,service=ats-be
* 18:34 brennen@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.19 refs [[phab:T413810|T413810]]
* 18:29 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4039.ulsfo.wmnet with reason: host reimage
* 18:29 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:28 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Updating dse-k8s-worker1019 - btullis@cumin1003"
* 18:26 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2332.codfw.wmnet
* 18:26 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2332.codfw.wmnet
* 18:25 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Updating dse-k8s-worker1019 - btullis@cumin1003"
* 18:24 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4039.ulsfo.wmnet with reason: host reimage
* 18:23 brennen@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251139{{!}}Ensure that we always run ParserHooks::transformHtml() when using Parsoid (T419830)]] (duration: 14m 46s)
* 18:21 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 18:20 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4038.ulsfo.wmnet with OS trixie
* 18:19 brennen@deploy2002: cscott, brennen: Continuing with sync
* 18:18 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 18:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4045.ulsfo.wmnet with OS trixie
* 18:10 brennen@deploy2002: cscott, brennen: Backport for [[gerrit:1251139{{!}}Ensure that we always run ParserHooks::transformHtml() when using Parsoid (T419830)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:10 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1019.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:08 brennen@deploy2002: Started scap sync-world: Backport for [[gerrit:1251139{{!}}Ensure that we always run ParserHooks::transformHtml() when using Parsoid (T419830)]]
* 18:02 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1019.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:02 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 17:59 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4039.ulsfo.wmnet with OS trixie
* 17:58 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1019
* 17:58 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1019
* 17:56 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4039.ulsfo.wmnet [reason: trixie reimaging]
* 17:55 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp20(3[6-9]{{!}}4[012]).*
* 17:54 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 17:53 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet [reason: trixie reimaging]
* 17:52 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4045.ulsfo.wmnet with reason: host reimage
* 17:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4037.ulsfo.wmnet with OS trixie
* 17:49 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4045.ulsfo.wmnet with reason: host reimage
* 17:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1018.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:33 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1018.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:31 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1018
* 17:31 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1018
* 17:30 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 17:28 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS trixie
* 17:28 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4045.ulsfo.wmnet with OS trixie
* 17:27 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp203[0-5].*
* 17:24 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4037.ulsfo.wmnet with reason: host reimage
* 17:20 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 17:20 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 17:18 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4037.ulsfo.wmnet with reason: host reimage
* 17:17 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-backup1004.eqiad.wmnet with OS trixie
* 17:17 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 17:16 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 17:06 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp202[89].*
* 17:03 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp2027.*
* 16:59 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 16:58 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4038.ulsfo.wmnet [reason: trixie reimaging]
* 16:58 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-backup1004.eqiad.wmnet with reason: host reimage
* 16:58 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:57 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4037.ulsfo.wmnet with OS trixie
* 16:57 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet [reason: trixie reimaging]
* 16:55 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-backup1004.eqiad.wmnet with reason: host reimage
* 16:50 swfrench-wmf: reprepro include php-apcu_5.1.24-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:45 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 16:45 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 16:44 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 16:43 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 16:43 swfrench-wmf: reprepro include dh-php_5.5+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:42 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 16:41 swfrench-wmf: reprepro include php-defaults_94+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:37 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ms-backup1004.eqiad.wmnet with OS trixie
* 16:36 swfrench-wmf: reprepro include php8.3_8.3.30-1+wmf11u2+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:27 dzahn@dns1004: END - running authdns-update
* 16:26 dzahn@dns1004: START - running authdns-update
* 16:25 mutante: switching old status.wikimedia.org page away from rackspace [[phab:T414098|T414098]]
* 16:25 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS trixie
* 16:20 dzahn@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 16:20 dzahn@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 16:19 dzahn@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 16:19 dzahn@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 16:12 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
* 16:11 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
* 16:10 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 16:09 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 16:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 16:09 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 16:08 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 16:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 16:07 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 16:06 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 16:05 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
* 16:04 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
* 16:04 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 16:04 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 16:03 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
* 16:02 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
* 16:02 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 16:02 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main'.
* 16:01 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 15:58 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 15:57 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 15:57 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 15:56 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 15:52 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudgw2002-dev.codfw.wmnet
* 15:52 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:52 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw2002-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 15:47 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw2002-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 15:43 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 15:41 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:40 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:36 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudgw2002-dev.codfw.wmnet
* 15:35 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 15:33 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 15:27 ebernhardson@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 15:26 ebernhardson@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 15:19 moritzm: reuploaded libxml2 2.9.10+dfsg-6.7+deb11u9+wmf11u1 and 72.1-3+deb12u1~wmf11u1 to component/php83-icu72 for bullseye-wikimedia [[phab:T419058|T419058]]
* 15:14 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:13 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:13 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=tcp-proxy4004.ulsfo.wmnet
* 15:13 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=tcp-proxy4004.ulsfo.wmnet
* 15:12 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=tcp-proxy4003.ulsfo.wmnet
* 15:12 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=tcp-proxy4003.ulsfo.wmnet
* 15:00 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 15:00 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:57 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:56 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:45 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:44 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1018.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:34 andrew@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 14:33 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1018.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:31 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 14:31 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1018
* 14:31 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1018
* 14:25 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:24 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:20 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1020.eqiad.wmnet
* 14:15 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 24 hosts with reason: Switch BGP bounce
* 14:12 ayounsi@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet
* 14:09 mlitn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251034{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251035{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251036{{!}}Remove queueing logic (T419587)]], [[gerrit:1251037{{!}}Remove queueing logic (T419587)]] (duration: 07m 15s)
* 14:08 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 14:07 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:07 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:05 mlitn@deploy2002: mlitn: Continuing with sync
* 14:04 mlitn@deploy2002: mlitn: Backport for [[gerrit:1251034{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251035{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251036{{!}}Remove queueing logic (T419587)]], [[gerrit:1251037{{!}}Remove queueing logic (T419587)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:03 XioNoX: start eqiad rack D2 depools
* 14:02 mlitn@deploy2002: Started scap sync-world: Backport for [[gerrit:1251034{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251035{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251036{{!}}Remove queueing logic (T419587)]], [[gerrit:1251037{{!}}Remove queueing logic (T419587)]]
* 13:59 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:59 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:57 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:54 moritzm: installing libssh security updates
* 13:54 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:45 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251031{{!}}ext.testKitchen: Depend on mediawiki.user module]], [[gerrit:1251048{{!}}Add title to the request context in FlaggedRevsCacheTest (T419539)]], [[gerrit:1251032{{!}}ext.testKitchen: Depend on mediawiki.user module]] (duration: 08m 01s)
* 13:42 phuedx@deploy2002: phuedx: Continuing with sync
* 13:39 phuedx@deploy2002: phuedx: Backport for [[gerrit:1251031{{!}}ext.testKitchen: Depend on mediawiki.user module]], [[gerrit:1251048{{!}}Add title to the request context in FlaggedRevsCacheTest (T419539)]], [[gerrit:1251032{{!}}ext.testKitchen: Depend on mediawiki.user module]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:37 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1251031{{!}}ext.testKitchen: Depend on mediawiki.user module]], [[gerrit:1251048{{!}}Add title to the request context in FlaggedRevsCacheTest (T419539)]], [[gerrit:1251032{{!}}ext.testKitchen: Depend on mediawiki.user module]]
* 13:26 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251005{{!}}Deploy EditCheck suggestion mode at all Wikipedias (T415320)]] (duration: 06m 42s)
* 13:24 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 esanders@deploy2002: esanders: Continuing with sync
* 13:22 esanders@deploy2002: esanders: Backport for [[gerrit:1251005{{!}}Deploy EditCheck suggestion mode at all Wikipedias (T415320)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1178.eqiad.wmnet
* 13:21 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 13:20 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1251005{{!}}Deploy EditCheck suggestion mode at all Wikipedias (T415320)]]
* 13:18 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250656{{!}}Add multilingual revert risk host header for LiftWing requests (T419718)]] (duration: 10m 52s)
* 13:14 fnegri@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.add-wiki (exit_code=99) for database kaiwiki ([[phab:T414240|T414240]])
* 13:14 fnegri@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database kaiwiki ([[phab:T414240|T414240]])
* 13:14 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 13:14 kgraessle@deploy2002: kgraessle: Continuing with sync
* 13:12 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 13:11 kgraessle@deploy2002: kgraessle: Backport for [[gerrit:1250656{{!}}Add multilingual revert risk host header for LiftWing requests (T419718)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 13:07 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1250656{{!}}Add multilingual revert risk host header for LiftWing requests (T419718)]]
* 13:05 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1159.eqiad.wmnet
* 13:03 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 13:02 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 13:02 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:02 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:57 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1159.eqiad.wmnet
* 12:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1013.eqiad.wmnet
* 12:49 dpogorzelski@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop: sync
* 12:49 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1013.eqiad.wmnet
* 12:49 dpogorzelski@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: sync
* 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:31 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host durum4004.ulsfo.wmnet
* 12:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4004.ulsfo.wmnet
* 12:28 moritzm: installing postgresql-17 security updates
* 12:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4004.ulsfo.wmnet
* 12:14 moritzm: installing wireshark security updates
* 12:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1013.eqiad.wmnet with reason: host reimage
* 12:07 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1013.eqiad.wmnet with reason: host reimage
* 11:52 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:51 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:50 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host tcp-proxy4004.ulsfo.wmnet
* 11:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy4004.ulsfo.wmnet with OS trixie
* 11:49 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy4004.ulsfo.wmnet with reason: host reimage
* 11:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy4004.ulsfo.wmnet with reason: host reimage
* 11:19 jayme: disabled puppet on all wikikube worker nodes to rollout/test new apparmor profiles in staging - [[phab:T419781|T419781]]
* 11:07 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host tcp-proxy4004.ulsfo.wmnet with OS trixie
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy4004.ulsfo.wmnet - jmm@cumin2002"
* 11:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy4004.ulsfo.wmnet - jmm@cumin2002"
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) tcp-proxy4004.ulsfo.wmnet on all recursors
* 11:06 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache tcp-proxy4004.ulsfo.wmnet on all recursors
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy4004.ulsfo.wmnet - jmm@cumin2002"
* 11:03 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:00 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy4004.ulsfo.wmnet - jmm@cumin2002"
* 10:42 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device asw1-23-ulsfo
* 10:41 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device asw1-23-ulsfo
* 10:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1013.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 10:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4003.ulsfo.wmnet
* 10:31 vgutierrez@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir4001.ulsfo.wmnet
* 10:31 vgutierrez@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir4002.ulsfo.wmnet
* 10:31 vgutierrez@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir4004.ulsfo.wmnet
* 10:30 vgutierrez@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir4003.ulsfo.wmnet
* 10:30 vgutierrez: repooling ncredir4003 & ncredir4004
* 10:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4003.ulsfo.wmnet
* 10:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host tcp-proxy4004.ulsfo.wmnet
* 10:26 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1013.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 10:26 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:25 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1013
* 10:22 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1013
* 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host tcp-proxy4003.ulsfo.wmnet
* 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy4003.ulsfo.wmnet with OS trixie
* 10:12 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1011.eqiad.wmnet
* 10:12 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:11 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:11 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:10 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:09 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1011.eqiad.wmnet
* 10:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1010.eqiad.wmnet
* 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy4003.ulsfo.wmnet with reason: host reimage
* 10:00 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1010.eqiad.wmnet
* 09:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy4003.ulsfo.wmnet with reason: host reimage
* 09:48 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/SERVICE_NAME: apply
* 09:48 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/SERVICE_NAME: apply
* 09:40 mvernon@cumin2002: conftool action : set/pooled=yes; selector: name=ms-fe2024.codfw.wmnet
* 09:40 mvernon@cumin2002: conftool action : set/pooled=yes; selector: name=ms-fe2023.codfw.wmnet
* 09:40 mvernon@cumin2002: conftool action : set/pooled=yes; selector: name=ms-fe2022.codfw.wmnet
* 09:40 mvernon@cumin2002: conftool action : set/pooled=yes; selector: name=ms-fe2021.codfw.wmnet
* 09:39 mvernon@cumin2002: conftool action : set/weight=40; selector: name=ms-fe2024.codfw.wmnet
* 09:39 mvernon@cumin2002: conftool action : set/weight=40; selector: name=ms-fe2023.codfw.wmnet
* 09:39 mvernon@cumin2002: conftool action : set/weight=40; selector: name=ms-fe2022.codfw.wmnet
* 09:39 mvernon@cumin2002: conftool action : set/weight=40; selector: name=ms-fe2021.codfw.wmnet
* 09:39 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Post reimage - btullis@cumin1003"
* 09:39 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Post reimage - btullis@cumin1003"
* 09:39 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 09:39 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 09:38 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 09:38 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 09:35 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on P<nowiki>{</nowiki>ms-fe[2009-2020].codfw.wmnet<nowiki>}</nowiki> and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 09:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host durum4004.ulsfo.wmnet
* 09:32 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 09:32 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host tcp-proxy4003.ulsfo.wmnet with OS trixie
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy4003.ulsfo.wmnet - jmm@cumin2002"
* 09:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy4003.ulsfo.wmnet - jmm@cumin2002"
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) tcp-proxy4003.ulsfo.wmnet on all recursors
* 09:30 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache tcp-proxy4003.ulsfo.wmnet on all recursors
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy4003.ulsfo.wmnet - jmm@cumin2002"
* 09:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy4003.ulsfo.wmnet - jmm@cumin2002"
* 09:28 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on P<nowiki>{</nowiki>ms-fe[2009-2020].codfw.wmnet<nowiki>}</nowiki> and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 09:28 Emperor: roll-restart codfw ms frontends prior to pooling new ones [[phab:T416243|T416243]]
* 09:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host durum4003.ulsfo.wmnet
* 09:23 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:23 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host tcp-proxy4003.ulsfo.wmnet
* 09:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host durum4003.ulsfo.wmnet
* 09:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts netflow4002.ulsfo.wmnet
* 09:01 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:01 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: netflow4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:56 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: netflow4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:51 slyngshede@dns1004: END - running authdns-update
* 08:50 slyngshede@dns1004: START - running authdns-update
* 08:37 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:32 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts netflow4002.ulsfo.wmnet
* 08:25 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 08:23 arnaudb@dns1004: END - running authdns-update
* 08:21 arnaudb@dns1004: START - running authdns-update
* 07:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir4004.ulsfo.wmnet
* 07:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncredir4004.ulsfo.wmnet
* 07:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir4003.ulsfo.wmnet
* 07:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncredir4003.ulsfo.wmnet
* 05:24 kart_: staging: machinetranslation: Optimize model loading and memory footprints ([[phab:T411058|T411058]])
* 05:19 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
* 05:16 kartik@deploy2002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
* 02:16 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2005.codfw.wmnet with OS trixie
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 14s)
* 02:03 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:59 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2005.codfw.wmnet with reason: host reimage
* 01:52 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2005.codfw.wmnet with reason: host reimage
* 01:49 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:47 swfrench-wmf: reprepro include php-apcu_5.1.24-1+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:37 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2005.codfw.wmnet with OS trixie
* 01:36 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2004.codfw.wmnet with OS trixie
* 01:24 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7012.*
* 01:20 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 01:18 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2004.codfw.wmnet with reason: host reimage
* 01:18 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 01:15 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2004.codfw.wmnet with reason: host reimage
* 01:13 swfrench-wmf: reprepro include dh-php_5.5+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:08 swfrench-wmf: reprepro include php-defaults_94+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:05 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 01:05 rzl@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 01:03 swfrench-wmf: reprepro include php8.3_8.3.30-1+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:00 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2004.codfw.wmnet with OS trixie
* 00:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7012.magru.wmnet with OS trixie
* 00:59 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - brett@cumin2002"
* 00:58 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - brett@cumin2002"
* 00:38 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 00:38 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 00:37 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 00:37 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 00:36 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 00:36 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 00:33 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 00:29 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7012.magru.wmnet with reason: host reimage
* 00:27 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 00:24 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7012.magru.wmnet with reason: host reimage
* 00:03 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7012.magru.wmnet with OS trixie
== 2026-03-11 ==
* 23:56 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7009.*
* 22:52 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:52 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:45 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 22:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 22:29 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 22:29 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 22:27 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7009.magru.wmnet with OS trixie
* 21:56 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 21:55 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 21:54 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250051{{!}}OrchestratorRequest: Switch evaluations to v2 endpoint (T413727)]] (duration: 18m 19s)
* 21:47 jforrester@deploy2002: jforrester: Continuing with sync
* 21:43 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:43 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7009.magru.wmnet with reason: host reimage
* 21:42 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:40 jforrester@deploy2002: jforrester: Backport for [[gerrit:1250051{{!}}OrchestratorRequest: Switch evaluations to v2 endpoint (T413727)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:39 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7009.magru.wmnet with reason: host reimage
* 21:35 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1250051{{!}}OrchestratorRequest: Switch evaluations to v2 endpoint (T413727)]]
* 21:30 rzl: rzl@apt1002:~$ sudo -i reprepro -C component/envoy-future include bullseye-wikimedia /home/rzl/envoyproxy_1.35.9-1_amd64.changes
* 21:29 arlolra@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250665{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]], [[gerrit:1250666{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]] (duration: 35m 16s)
* 21:16 arlolra@deploy2002: arlolra: Continuing with sync
* 21:15 arlolra@deploy2002: arlolra: Backport for [[gerrit:1250665{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]], [[gerrit:1250666{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:08 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7009.magru.wmnet with OS trixie
* 21:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7010.*
* 21:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7010.magru.wmnet with OS trixie
* 20:54 arlolra@deploy2002: Started scap sync-world: Backport for [[gerrit:1250665{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]], [[gerrit:1250666{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]]
* 20:47 jsn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250579{{!}}urwikisource: add logo, sitename and projectnamespace (T415974)]] (duration: 06m 55s)
* 20:43 jsn@deploy2002: anzx, jsn: Continuing with sync
* 20:42 jsn@deploy2002: anzx, jsn: Backport for [[gerrit:1250579{{!}}urwikisource: add logo, sitename and projectnamespace (T415974)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:40 jsn@deploy2002: Started scap sync-world: Backport for [[gerrit:1250579{{!}}urwikisource: add logo, sitename and projectnamespace (T415974)]]
* 20:38 jsn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250581{{!}}riskyArticleEdits: show page descriptions (T419442)]], [[gerrit:1250582{{!}}Fix Instrumentation on mobile view (T419517)]], [[gerrit:1250632{{!}}ext.wikimediaEvents: Updated Test Kitchen impact test experiment (T407570)]] (duration: 10m 37s)
* 20:38 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on ml-serve1014.eqiad.wmnet with reason: [[phab:T400626|T400626]]
* 20:37 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7010.magru.wmnet with reason: host reimage
* 20:34 jsn@deploy2002: jsn, sfaci: Continuing with sync
* 20:34 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search-test: apply
* 20:33 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search-test: apply
* 20:32 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7010.magru.wmnet with reason: host reimage
* 20:30 jsn@deploy2002: jsn, sfaci: Backport for [[gerrit:1250581{{!}}riskyArticleEdits: show page descriptions (T419442)]], [[gerrit:1250582{{!}}Fix Instrumentation on mobile view (T419517)]], [[gerrit:1250632{{!}}ext.wikimediaEvents: Updated Test Kitchen impact test experiment (T407570)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:28 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on gitlab1003.wikimedia.org with reason: Upgrade
* 20:28 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on gitlab2002.wikimedia.org with reason: Upgrade
* 20:27 jsn@deploy2002: Started scap sync-world: Backport for [[gerrit:1250581{{!}}riskyArticleEdits: show page descriptions (T419442)]], [[gerrit:1250582{{!}}Fix Instrumentation on mobile view (T419517)]], [[gerrit:1250632{{!}}ext.wikimediaEvents: Updated Test Kitchen impact test experiment (T407570)]]
* 20:21 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:18 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 20:17 bvibber@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250647{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]], [[gerrit:1250648{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]] (duration: 06m 47s)
* 20:13 bvibber@deploy2002: bvibber: Continuing with sync
* 20:12 bvibber@deploy2002: bvibber: Backport for [[gerrit:1250647{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]], [[gerrit:1250648{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:10 bvibber@deploy2002: Started scap sync-world: Backport for [[gerrit:1250647{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]], [[gerrit:1250648{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]]
* 19:59 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7010.magru.wmnet with OS trixie
* 19:54 andrew@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 19:51 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 19:37 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-backup1004.eqiad.wmnet with OS trixie
* 19:01 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts cp7011.magru.wmnet
* 19:01 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7011.magru.wmnet
* 18:56 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough
* 18:49 brennen@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.19 refs [[phab:T413810|T413810]]
* 18:49 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:49 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:45 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:45 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:44 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:44 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:43 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling restart_daemons on A:wikidough
* 18:42 brennen: 1.46.0-wmf.19 train status: no current blockers, going ahead to group1.
* 18:39 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2332.codfw.wmnet
* 18:37 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2332.codfw.wmnet
* 18:20 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7011.*
* 18:18 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 18:16 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ms-backup1004.eqiad.wmnet with OS trixie
* 18:13 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 17:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 17:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1010.eqiad.wmnet with reason: host reimage
* 17:52 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:52 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:48 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 17:47 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1010.eqiad.wmnet with reason: host reimage
* 17:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 17:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 17:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:38 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:38 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:37 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:37 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:36 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:36 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:36 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:35 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 17:34 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 17:32 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
* 17:31 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
* 17:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:28 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:28 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:27 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:20 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:19 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 17:19 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:18 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:15 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 17:13 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:12 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:09 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:09 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:02 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7011.magru.wmnet with OS trixie
* 17:01 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 17:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 16:58 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on durum4004.ulsfo.wmnet with reason: in setup
* 16:58 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on durum4003.ulsfo.wmnet with reason: in setup
* 16:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 16:40 root@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:40 root@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moving many things from cloudgw2002-dev to cloudgw2004-dev - root@cumin2002"
* 16:40 root@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moving many things from cloudgw2002-dev to cloudgw2004-dev - root@cumin2002"
* 16:39 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 16:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 16:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7011.magru.wmnet with reason: host reimage
* 16:35 root@cumin2002: START - Cookbook sre.dns.netbox
* 16:32 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus4002.ulsfo.wmnet
* 16:32 tappof@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:32 tappof@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - tappof@cumin1003"
* 16:30 tappof@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - tappof@cumin1003"
* 16:30 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7011.magru.wmnet with reason: host reimage
* 16:25 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 16:23 tappof@cumin1003: START - Cookbook sre.dns.netbox
* 16:18 tappof@cumin1003: START - Cookbook sre.hosts.decommission for hosts prometheus4002.ulsfo.wmnet
* 15:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:52 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7011.magru.wmnet with OS trixie
* 15:51 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 15:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 15:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 15:50 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:49 urbanecm@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:48 sukhe: sudo cumin -b1 -s10 "C:dnsrecursor" "run-puppet-agent --enable 'merging CR 1250576'"
* 15:48 urbanecm@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:46 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:45 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 15:43 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 15:39 sukhe: sudo cumin "C:dnsrecursor" "disable-puppet 'merging CR 1250576'"
* 15:35 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 15:26 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 15:08 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 15:08 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 15:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:53 swfrench-wmf: updated component/php83-icu72 with libpcre2 10.42-1~wmf11+1 from apt-staging - [[phab:T419058|T419058]]
* 14:46 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 14:45 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 14:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum4004.ulsfo.wmnet
* 14:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum4004.ulsfo.wmnet with OS trixie
* 14:39 vgutierrez: depool ncredir4003 && ncredir4004
* 14:38 vgutierrez: repool ncredir4001 && ncredir4002
* 14:31 jmm@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir4002.ulsfo.wmnet
* 14:31 jmm@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir4001.ulsfo.wmnet
* 14:30 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir4004.ulsfo.wmnet
* 14:30 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=ncredir4004.ulsfo.wmnet
* 14:27 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir4003.ulsfo.wmnet
* 14:27 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=ncredir4003.ulsfo.wmnet
* 14:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum4004.ulsfo.wmnet with reason: host reimage
* 14:23 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:23 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:19 moritzm: installing python-urllib3 security updates
* 14:18 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum4004.ulsfo.wmnet with reason: host reimage
* 14:14 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 14:13 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:12 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:12 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:12 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:12 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:11 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:11 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:11 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:08 gkyziridis@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:08 gkyziridis@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:07 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250568{{!}}Fix pinnableElement export (T419620)]] (duration: 06m 26s)
* 14:06 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:04 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:04 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:03 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:03 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 14:03 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:02 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1250568{{!}}Fix pinnableElement export (T419620)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:00 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1250568{{!}}Fix pinnableElement export (T419620)]]
* 13:58 moritzm: uploaded libxml2 2.9.10+dfsg-6.7+deb11u9+wmf11u1 to component/php83-icu72 for bullseye-wikimedia (special build of libxml with ICU disabled to ensure co-installability between icu 67 and icu 72) [[phab:T419058|T419058]]
* 13:57 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250566{{!}}Restore advanced main menu for AMC (T413912)]] (duration: 10m 44s)
* 13:55 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum4004.ulsfo.wmnet with OS trixie
* 13:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:54 vgutierrez: repool cp7016
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4004.ulsfo.wmnet - jmm@cumin2002"
* 13:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4004.ulsfo.wmnet - jmm@cumin2002"
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum4004.ulsfo.wmnet on all recursors
* 13:54 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum4004.ulsfo.wmnet on all recursors
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4004.ulsfo.wmnet - jmm@cumin2002"
* 13:51 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 13:50 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1250566{{!}}Restore advanced main menu for AMC (T413912)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:49 vgutierrez: depool cp7016
* 13:49 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4004.ulsfo.wmnet - jmm@cumin2002"
* 13:46 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1250566{{!}}Restore advanced main menu for AMC (T413912)]]
* 13:45 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:44 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:44 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247547{{!}}Remove `MetricsPlatform` configuration from production (T416865)]] (duration: 35m 52s)
* 13:43 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 13:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 13:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4004.ulsfo.wmnet with OS bookworm
* 13:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:36 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum4004.ulsfo.wmnet
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum4003.ulsfo.wmnet
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum4003.ulsfo.wmnet with OS trixie
* 13:36 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 13:35 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 13:30 jdlrobson@deploy2002: jdlrobson, sfaci: Continuing with sync
* 13:29 jdlrobson@deploy2002: jdlrobson, sfaci: Backport for [[gerrit:1247547{{!}}Remove `MetricsPlatform` configuration from production (T416865)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
* 13:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum4003.ulsfo.wmnet with reason: host reimage
* 13:18 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
* 13:13 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum4003.ulsfo.wmnet with reason: host reimage
* 13:08 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1247547{{!}}Remove `MetricsPlatform` configuration from production (T416865)]]
* 13:00 moritzm: installing libcommons-lang3-java security updates
* 12:57 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4004.ulsfo.wmnet with OS bookworm
* 12:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4003.ulsfo.wmnet with OS bookworm
* 12:46 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum4003.ulsfo.wmnet with OS trixie
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4003.ulsfo.wmnet - jmm@cumin2002"
* 12:46 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4003.ulsfo.wmnet - jmm@cumin2002"
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum4003.ulsfo.wmnet on all recursors
* 12:45 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum4003.ulsfo.wmnet on all recursors
* 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4003.ulsfo.wmnet - jmm@cumin2002"
* 12:41 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4003.ulsfo.wmnet - jmm@cumin2002"
* 12:37 moritzm: installing inetutils security updates
* 12:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:36 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum4003.ulsfo.wmnet
* 12:35 tappof: completed migration from prometheus4002 to prometheus4003 (ulsfo) (T419430)
* 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
* 12:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
* 12:28 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 12:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2073.codfw.wmnet with OS bullseye
* 12:23 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 12:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow4003.ulsfo.wmnet
* 12:18 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 12:17 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1011
* 12:17 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1011
* 12:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2072.codfw.wmnet with OS bullseye
* 12:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow4003.ulsfo.wmnet
* 12:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2073.codfw.wmnet with reason: host reimage
* 12:04 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4003.ulsfo.wmnet with OS bookworm
* 12:01 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1010
* 11:59 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1010
* 11:58 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2073.codfw.wmnet with reason: host reimage
* 11:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2072.codfw.wmnet with reason: host reimage
* 11:48 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2072.codfw.wmnet with reason: host reimage
* 11:41 urbanecm@deploy2002: Finished scap sync-world: Backport for [[gerrit:1239954{{!}}[Growth] Enable on every new Wikipedia by default (T304052)]] (duration: 06m 39s)
* 11:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2073
* 11:38 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2073
* 11:37 vgutierrez: upgrading to acme-chief 0.39 on acme-chief production instances - [[phab:T419352|T419352]]
* 11:37 urbanecm@deploy2002: urbanecm: Continuing with sync
* 11:36 urbanecm@deploy2002: urbanecm: Backport for [[gerrit:1239954{{!}}[Growth] Enable on every new Wikipedia by default (T304052)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:36 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2073
* 11:36 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2073.codfw.wmnet 212.48.192.10.in-addr.arpa 2.1.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:36 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2073.codfw.wmnet 212.48.192.10.in-addr.arpa 2.1.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:36 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:36 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2073 - mvernon@cumin2002"
* 11:36 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2073 - mvernon@cumin2002"
* 11:35 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 11:34 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 11:34 urbanecm@deploy2002: Started scap sync-world: Backport for [[gerrit:1239954{{!}}[Growth] Enable on every new Wikipedia by default (T304052)]]
* 11:34 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 11:34 urbanecm@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250539{{!}}[Growth] kaiwiki: Enable GrowthExperiments (T304052)]] (duration: 14m 11s)
* 11:33 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 11:33 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 11:32 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 11:32 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:31 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2073
* 11:30 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2073.codfw.wmnet with OS bullseye
* 11:30 urbanecm@deploy2002: urbanecm: Continuing with sync
* 11:29 cgoubert@dns1004: END - running authdns-update
* 11:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2072
* 11:29 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2072
* 11:28 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2072
* 11:28 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2072.codfw.wmnet 158.32.192.10.in-addr.arpa 8.5.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:28 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2072.codfw.wmnet 158.32.192.10.in-addr.arpa 8.5.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:28 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:28 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2072 - mvernon@cumin2002"
* 11:28 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2072 - mvernon@cumin2002"
* 11:28 cgoubert@dns1004: START - running authdns-update
* 11:26 urbanecm@deploy2002: mwscript-k8s job started: WikimediaMaintenance:createExtensionTables.php --wiki=kaiwiki growthexperiments # [[phab:T304052|T304052]]
* 11:24 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:24 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2072
* 11:23 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2072.codfw.wmnet with OS bullseye
* 11:22 tappof@dns1004: END - running authdns-update
* 11:22 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
* 11:21 urbanecm@deploy2002: urbanecm: Backport for [[gerrit:1250539{{!}}[Growth] kaiwiki: Enable GrowthExperiments (T304052)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:21 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
* 11:21 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
* 11:21 tappof@dns1004: START - running authdns-update
* 11:21 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
* 11:19 urbanecm@deploy2002: Started scap sync-world: Backport for [[gerrit:1250539{{!}}[Growth] kaiwiki: Enable GrowthExperiments (T304052)]]
* 11:19 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 11:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2071.codfw.wmnet with OS bullseye
* 11:18 urbanecm@deploy2002: mwscript-k8s job started: WikimediaMaintenance:createExtensionTables.php --wiki=kaiwiki growthexperiments # [[phab:T304052|T304052]]
* 11:10 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
* 11:10 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
* 11:08 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
* 11:08 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
* 11:05 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
* 11:05 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
* 10:58 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2071.codfw.wmnet with reason: host reimage
* 10:54 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2071.codfw.wmnet with reason: host reimage
* 10:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2071
* 10:35 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2071
* 10:34 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2071
* 10:34 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2071.codfw.wmnet 221.16.192.10.in-addr.arpa 1.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:34 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2071.codfw.wmnet 221.16.192.10.in-addr.arpa 1.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:34 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:34 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2071 - mvernon@cumin2002"
* 10:34 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2071 - mvernon@cumin2002"
* 10:26 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 10:24 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2071
* 10:23 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2071.codfw.wmnet with OS bullseye
* 10:08 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2095.codfw.wmnet with OS bullseye
* 10:03 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Failed step after ml-serve1015's reimage - elukey@cumin1003"
* 10:02 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Failed step after ml-serve1015's reimage - elukey@cumin1003"
* 10:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1015.eqiad.wmnet with OS trixie
* 10:01 elukey@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 09:59 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2096.codfw.wmnet with OS bullseye
* 09:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2096.codfw.wmnet with OS bullseye
* 09:52 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
* 09:51 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
* 09:51 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2095.codfw.wmnet with OS bullseye
* 09:51 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
* 09:46 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
* 09:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
* 09:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
* 09:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 09:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 09:34 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 09:33 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
* 09:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
* 09:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 09:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
* 09:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 09:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
* 09:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
* 09:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 09:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 09:28 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
* 09:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 09:27 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
* 09:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 09:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 09:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 09:24 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
* 09:22 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ncredir4004.ulsfo.wmnet
* 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4004.ulsfo.wmnet with OS bookworm
* 09:15 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:15 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:14 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 09:10 javiermonton@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249217{{!}}stream: mediawiki.page_html_content_change (T419258)]] (duration: 08m 28s)
* 09:07 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2096.codfw.wmnet with OS bullseye
* 09:06 javiermonton@deploy2002: javiermonton: Continuing with sync
* 09:03 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2095.codfw.wmnet with OS bullseye
* 09:03 javiermonton@deploy2002: javiermonton: Backport for [[gerrit:1249217{{!}}stream: mediawiki.page_html_content_change (T419258)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
* 09:01 javiermonton@deploy2002: Started scap sync-world: Backport for [[gerrit:1249217{{!}}stream: mediawiki.page_html_content_change (T419258)]]
* 08:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1015.eqiad.wmnet with reason: host reimage
* 08:58 trueg@deploy2002: helmfile [staging] DONE helmfile.d/services/SERVICE_NAME: apply
* 08:58 trueg@deploy2002: helmfile [staging] START helmfile.d/services/SERVICE_NAME: apply
* 08:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
* 08:55 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2239.codfw.wmnet with reason: mysql upgrade / restart
* 08:54 moritzm: installing imagemagick security updates
* 08:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1015.eqiad.wmnet with reason: host reimage
* 08:41 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1015.eqiad.wmnet with OS trixie
* 08:40 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1014.eqiad.wmnet with OS trixie
* 08:40 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:39 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:35 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4004.ulsfo.wmnet with OS bookworm
* 08:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
* 08:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir4004.ulsfo.wmnet on all recursors
* 08:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir4004.ulsfo.wmnet on all recursors
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
* 08:25 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
* 08:23 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1014.eqiad.wmnet with reason: host reimage
* 08:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 08:21 Msz2001: UTC morning backport window finished
* 08:21 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 08:21 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir4004.ulsfo.wmnet
* 08:21 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250426{{!}}Drop underscore from titles in wgOATH2FARequiredGroupRemovalPages]] (duration: 10m 46s)
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ncredir4003.ulsfo.wmnet
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4003.ulsfo.wmnet with OS bookworm
* 08:17 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1014.eqiad.wmnet with reason: host reimage
* 08:15 mszwarc@deploy2002: mszwarc: Continuing with sync
* 08:14 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1250426{{!}}Drop underscore from titles in wgOATH2FARequiredGroupRemovalPages]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:10 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1250426{{!}}Drop underscore from titles in wgOATH2FARequiredGroupRemovalPages]]
* 08:09 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249921{{!}}Display list of 2FA-req. groups on AccountSecurity for 2FA-less users (T419422)]], [[gerrit:1250066{{!}}Send2FAWarningNotifications: Support reading users from file (T419111)]] (duration: 33m 07s)
* 08:05 moritzm: installing mariadb bugfix updates from Bookworm point release (tools and libraries as packaged in Debian, unrelated to the wmf-mariadb packages)
* 08:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1014.eqiad.wmnet with OS trixie
* 08:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
* 07:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
* 07:57 mszwarc@deploy2002: mszwarc: Continuing with sync
* 07:56 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1249921{{!}}Display list of 2FA-req. groups on AccountSecurity for 2FA-less users (T419422)]], [[gerrit:1250066{{!}}Send2FAWarningNotifications: Support reading users from file (T419111)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:44 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti1049.eqiad.wmnet
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
* 07:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
* 07:38 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4003.ulsfo.wmnet with OS bookworm
* 07:36 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1249921{{!}}Display list of 2FA-req. groups on AccountSecurity for 2FA-less users (T419422)]], [[gerrit:1250066{{!}}Send2FAWarningNotifications: Support reading users from file (T419111)]]
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
* 07:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir4003.ulsfo.wmnet on all recursors
* 07:34 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir4003.ulsfo.wmnet on all recursors
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
* 07:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
* 07:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir4003.ulsfo.wmnet
* 07:22 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247639{{!}}Enable rr-ml AutoModerator CC Set AutoModeratorMultiLingualRevertRisk with available wikis (T400727)]] (duration: 12m 24s)
* 07:18 kgraessle@deploy2002: kgraessle: Continuing with sync
* 07:12 kgraessle@deploy2002: kgraessle: Backport for [[gerrit:1247639{{!}}Enable rr-ml AutoModerator CC Set AutoModeratorMultiLingualRevertRisk with available wikis (T400727)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:09 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1247639{{!}}Enable rr-ml AutoModerator CC Set AutoModeratorMultiLingualRevertRisk with available wikis (T400727)]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 59s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 00:33 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250117{{!}}Stop setting $wgImageLinksSchemaMigrationStage (T299953)]] (duration: 09m 38s)
* 00:29 zabe@deploy2002: zabe: Continuing with sync
* 00:26 zabe@deploy2002: zabe: Backport for [[gerrit:1250117{{!}}Stop setting $wgImageLinksSchemaMigrationStage (T299953)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:24 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1250117{{!}}Stop setting $wgImageLinksSchemaMigrationStage (T299953)]]
* 00:03 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
* 00:03 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint1003.wikimedia.org with OS trixie
* 00:03 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:03 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
== 2026-03-10 ==
* 23:58 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
* 23:53 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
* 23:49 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
* 23:44 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint1003.wikimedia.org with reason: host reimage
* 23:40 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on contint1003.wikimedia.org with reason: host reimage
* 23:31 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2096.codfw.wmnet with OS bullseye
* 23:31 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2095.codfw.wmnet with OS bullseye
* 23:26 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2095.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:24 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2096.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:22 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 23:22 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:12 vriley@cumin1003: START - Cookbook sre.hosts.provision for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:11 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2096.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:05 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 23:05 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:59 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2095.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:39 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:38 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:51 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp7012.magru.wmnet with OS trixie
* 21:48 Dreamy_Jazz: Evening UTC backport window done
* 21:42 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7006.magru.wmnet [reason: trixie reimaging]
* 21:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7006.magru.wmnet with OS trixie
* 21:25 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1235552{{!}}Migrate EmailAuth, step 2 (T404334)]] (duration: 25m 34s)
* 21:24 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7007.magru.wmnet [reason: trixie reimaging]
* 21:22 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7007.magru.wmnet with OS trixie
* 21:21 tgr@deploy2002: tgr: Continuing with sync
* 21:13 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
* 21:09 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
* 21:02 tgr@deploy2002: tgr: Backport for [[gerrit:1235552{{!}}Migrate EmailAuth, step 2 (T404334)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:00 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1235552{{!}}Migrate EmailAuth, step 2 (T404334)]]
* 20:59 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7012.magru.wmnet with OS trixie
* 20:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7007.magru.wmnet with reason: host reimage
* {{safesubst:SAL entry|1=20:50 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249983{{!}}Deploy participant recruitment survey on ptwiki and trwiki (T419275)]], [[gerrit:1238733{{!}}wikifunctions: Drop temporary WikifunctionsEnableHTMLOutput flag (T397402)]], [[gerrit:1238734{{!}}wikifunctions: Drop temporary WikifunctionsEnableWikidataInputTypes flag (T397403)]], [[gerrit:1249393{{!}}build: Upgrade mediawiki-phan-config from 0.18.0 to 0.2}}
* 20:48 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7007.magru.wmnet with reason: host reimage
* 20:46 jforrester@deploy2002: dani, jforrester: Continuing with sync
* {{safesubst:SAL entry|1=20:45 jforrester@deploy2002: dani, jforrester: Backport for [[gerrit:1249983{{!}}Deploy participant recruitment survey on ptwiki and trwiki (T419275)]], [[gerrit:1238733{{!}}wikifunctions: Drop temporary WikifunctionsEnableHTMLOutput flag (T397402)]], [[gerrit:1238734{{!}}wikifunctions: Drop temporary WikifunctionsEnableWikidataInputTypes flag (T397403)]], [[gerrit:1249393{{!}}build: Upgrade mediawiki-phan-config from 0.18.0 to 0.20.0 (T41}}
* 20:43 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7006.magru.wmnet with OS trixie
* {{safesubst:SAL entry|1=20:43 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1249983{{!}}Deploy participant recruitment survey on ptwiki and trwiki (T419275)]], [[gerrit:1238733{{!}}wikifunctions: Drop temporary WikifunctionsEnableHTMLOutput flag (T397402)]], [[gerrit:1238734{{!}}wikifunctions: Drop temporary WikifunctionsEnableWikidataInputTypes flag (T397403)]], [[gerrit:1249393{{!}}build: Upgrade mediawiki-phan-config from 0.18.0 to 0.20}}
* 20:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7006.magru.wmnet with OS trixie
* 20:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - cdobbins@cumin2002"
* 20:38 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1240012{{!}}Enable personal main menu to all users in Minerva Neue skin (T413912)]], [[gerrit:1250007{{!}}Enables legacy processing in ParserOutputPostCacheTransform when cached (T372592)]], [[gerrit:1250015{{!}}Parser: Raise minimum TTL from 30 min to 'next midnight' in miser mode (T416616 T416540 T419439)]] (duration: 12m 58s)
* 20:36 cdobbins@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - cdobbins@cumin2002"
* 20:34 jforrester@deploy2002: jforrester, cscott, bwang: Continuing with sync
* 20:27 jforrester@deploy2002: jforrester, cscott, bwang: Backport for [[gerrit:1240012{{!}}Enable personal main menu to all users in Minerva Neue skin (T413912)]], [[gerrit:1250007{{!}}Enables legacy processing in ParserOutputPostCacheTransform when cached (T372592)]], [[gerrit:1250015{{!}}Parser: Raise minimum TTL from 30 min to 'next midnight' in miser mode (T416616 T416540 T419439)]] synced to the testservers (see https://wikitech.wi
* 20:25 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1240012{{!}}Enable personal main menu to all users in Minerva Neue skin (T413912)]], [[gerrit:1250007{{!}}Enables legacy processing in ParserOutputPostCacheTransform when cached (T372592)]], [[gerrit:1250015{{!}}Parser: Raise minimum TTL from 30 min to 'next midnight' in miser mode (T416616 T416540 T419439)]]
* 20:25 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7007.magru.wmnet with OS trixie
* 20:24 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7007.magru.wmnet [reason: trixie reimaging]
* 20:24 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7005.magru.wmnet [reason: trixie reimaging]
* 20:16 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7005.magru.wmnet with OS trixie
* 20:10 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
* 20:03 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7013.*
* 20:03 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
* 19:50 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7013.magru.wmnet with OS trixie
* 19:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7005.magru.wmnet with reason: host reimage
* 19:42 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7005.magru.wmnet with reason: host reimage
* 19:40 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7006.magru.wmnet with OS trixie
* 19:40 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7006.magru.wmnet [reason: trixie reimaging]
* 19:39 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7004.magru.wmnet [reason: trixie reimaging]
* 19:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7013.magru.wmnet with reason: host reimage
* 19:19 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7005.magru.wmnet with OS trixie
* 19:19 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7004.magru.wmnet with OS trixie
* 19:19 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7005.magru.wmnet [reason: trixie reimaging]
* 19:18 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7013.magru.wmnet with reason: host reimage
* 19:17 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7003.magru.wmnet [reason: trixie reimaging]
* 19:16 brennen@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.19 refs [[phab:T413810|T413810]]
* 19:16 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7003.magru.wmnet with OS trixie
* 19:09 brennen: 1.46.0-wmf.19 train status: blockers believed resolved, rolling to group0
* 19:07 brennen@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249937{{!}}Re-add correct namespace for translatable pages (T419294)]] (duration: 12m 30s)
* 19:06 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
* 19:01 brennen@deploy2002: abi, brennen: Continuing with sync
* 18:58 brennen@deploy2002: abi, brennen: Backport for [[gerrit:1249937{{!}}Re-add correct namespace for translatable pages (T419294)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:55 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7013.magru.wmnet with OS trixie
* 18:54 brennen@deploy2002: Started scap sync-world: Backport for [[gerrit:1249937{{!}}Re-add correct namespace for translatable pages (T419294)]]
* 18:52 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7004.magru.wmnet with reason: host reimage
* 18:52 brennen@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.19 refs [[phab:T413810|T413810]] (duration: 38m 34s)
* 18:49 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7004.magru.wmnet with reason: host reimage
* 18:47 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7003.magru.wmnet with reason: host reimage
* 18:44 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7003.magru.wmnet with reason: host reimage
* 18:41 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7015.*
* 18:27 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7015.magru.wmnet with OS trixie
* 18:23 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7004.magru.wmnet with OS trixie
* 18:21 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7004.magru.wmnet [reason: trixie reimaging]
* 18:16 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7003.magru.wmnet with OS trixie
* 18:13 brennen@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.19 refs [[phab:T413810|T413810]]
* 18:13 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7003.magru.wmnet [reason: trixie reimaging]
* 18:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7003.magru.wmnet [reason: trixie reimaging]
* 18:00 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 17:59 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 17:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7015.magru.wmnet with reason: host reimage
* 17:59 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 17:59 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 17:56 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7015.magru.wmnet with reason: host reimage
* 17:54 hashar@deploy2002: Finished deploy [integration/docroot@f544f49]: Catch up with composer/npm dev dependencies. Noop for production (duration: 00m 11s)
* 17:54 hashar@deploy2002: Started deploy [integration/docroot@f544f49]: Catch up with composer/npm dev dependencies. Noop for production
* 17:43 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 17:43 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 17:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:32 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:31 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:30 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7015.magru.wmnet with OS trixie
* 17:30 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:29 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:28 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:28 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 17:28 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:26 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:23 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
* 17:22 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 17:21 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 17:12 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:12 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:11 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:11 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:09 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:09 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:01 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 16:40 andrew@dns1004: END - running authdns-update
* 16:38 andrew@dns1004: START - running authdns-update
* 16:25 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249993{{!}}Revert "CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled"]] (duration: 07m 45s)
* 16:21 reedy@deploy2002: reedy: Continuing with sync
* 16:19 reedy@deploy2002: reedy: Backport for [[gerrit:1249993{{!}}Revert "CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:17 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1249993{{!}}Revert "CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled"]]
* 15:59 jynus@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 15:59 taavi: update cr firewall policy for codfw1dev ldap tree https://gerrit.wikimedia.org/r/1249985
* 15:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-fr-tech: apply
* 15:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-fr-tech: apply
* 15:55 jynus@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 15:48 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:34 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:28 brouberol@dns1004: END - running authdns-update
* 15:27 brouberol@dns1004: START - running authdns-update
* 15:10 swfrench@cumin2002: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002"
* 15:10 swfrench@cumin2002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002
* 15:09 swfrench@cumin2002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002
* 15:09 swfrench@cumin2002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002"
* 15:05 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:59 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:58 sukhe: sudo cumin -b1 -s15 "C:bird" "run-puppet-agent --enable 'merging CR 1238007; add function return type'"
* 14:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:58 sukhe: sudo cumin -b1 -s15 "C:bird" "run-puppet-agent 'merging CR 1238007; add function return type'"
* 14:51 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:42 sukhe: sudo cumin "C:bird" "disable-puppet 'merging CR 1238007; add function return type'"
* 14:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host ml-serve1014
* 14:39 elukey@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:36 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host ml-serve1014
* 14:36 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.powercycle (exit_code=99) for host ml-serve1014
* 14:36 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host ml-serve1014
* 14:12 otto@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249367{{!}}stream: mediawiki.page_edit_type_simple.dev0 (T351225)]] (duration: 11m 05s)
* 14:08 otto@deploy2002: akhatun, otto: Continuing with sync
* 14:02 otto@deploy2002: akhatun, otto: Backport for [[gerrit:1249367{{!}}stream: mediawiki.page_edit_type_simple.dev0 (T351225)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:01 otto@deploy2002: Started scap sync-world: Backport for [[gerrit:1249367{{!}}stream: mediawiki.page_edit_type_simple.dev0 (T351225)]]
* 13:49 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
* 13:43 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:28 vgutierrez: testing acme-chief 0.39 in acmechief-test2001 - [[phab:T419352|T419352]]
* 13:27 vgutierrez: upload acme-chief 0.39 to bookworm-wikimedia (apt.wm.o) - [[phab:T419352|T419352]]
* 13:16 jiji@cumin1003: END (FAIL) - Cookbook sre.memcached.roll-reboot-restart (exit_code=1) rolling restart_daemons on A:memcached-canary
* 13:16 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling restart_daemons on A:memcached-canary
* 13:12 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249903{{!}}Require 2FA from 6 other user groups ($wgRestrictedGroups) (T418580)]], [[gerrit:1249035{{!}}kaiwiki: add logo, stiename, projectnamespace and timezone (T414237)]] (duration: 08m 45s)
* 13:08 mszwarc@deploy2002: mszwarc, anzx: Continuing with sync
* 13:05 mszwarc@deploy2002: mszwarc, anzx: Backport for [[gerrit:1249903{{!}}Require 2FA from 6 other user groups ($wgRestrictedGroups) (T418580)]], [[gerrit:1249035{{!}}kaiwiki: add logo, stiename, projectnamespace and timezone (T414237)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1249903{{!}}Require 2FA from 6 other user groups ($wgRestrictedGroups) (T418580)]], [[gerrit:1249035{{!}}kaiwiki: add logo, stiename, projectnamespace and timezone (T414237)]]
* 13:03 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad
* 12:57 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1015.eqiad.wmnet with OS bookworm
* 12:56 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 12:51 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1014.eqiad.wmnet with OS bookworm
* 12:50 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ml-serve1014
* 12:50 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ml-serve1014
* 12:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:49 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:49 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:48 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:47 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:45 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:44 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:42 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:42 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling restart_daemons on A:memcached-canary
* 12:42 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling restart_daemons on A:memcached-canary
* 12:31 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 12:10 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 12:10 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 11:59 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 11:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe2024.codfw.wmnet with OS bullseye
* 11:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1003"
* 11:17 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1003"
* 11:15 Emperor: rebalance codfw swift rings [[phab:T354872|T354872]]
* 10:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-fe2024.codfw.wmnet with reason: host reimage
* 10:47 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-fe2024.codfw.wmnet with reason: host reimage
* 10:31 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host ms-fe2024.codfw.wmnet with OS bullseye
* 10:30 ayounsi@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-fe2024.codfw.wmnet with OS bullseye
* 10:20 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host ms-fe2024.codfw.wmnet with OS bullseye
* 10:17 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device cr2-eqdfw
* 09:31 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device cr2-eqdfw
* 09:22 derick@deploy2002: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=loginwiki --logwiki=metawiki TMPRI1975 FondueFanatic # [[phab:T419499|T419499]]
* 09:00 arnaudb@dns1005: END - running authdns-update
* 09:00 godog: restore all host interfaces - [[phab:T417393|T417393]]
* 08:58 arnaudb@dns1005: START - running authdns-update
* 08:30 godog: disabled interface for cloudcephmon1004 - [[phab:T417393|T417393]]
* 08:22 godog: disabled interfaces for cloudcephosd1021 cloudcephosd1042 cloudcephosd1043 cloudcephosd1018 cloudcephosd1022 - [[phab:T417393|T417393]]
* 08:18 godog: disabled interfaces for cloudcephosd1016 cloudcephosd1017 cloudcephosd1016 cloudcephosd1018 cloudcephosd1017 cloudcephosd1035 - [[phab:T417393|T417393]]
* 08:05 godog: start disabling cloudcephosd interfaces - [[phab:T417393|T417393]]
* 07:49 godog: prep cloudsw reboot tests 'ceph osd set noout' - [[phab:T417393|T417393]]
* 07:41 filippo@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 19 hosts with reason: switch down tests
* 06:14 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs2009.codfw.wmnet with OS bookworm
* 04:09 pt1979@cumin2002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device asw1-23-ulsfo
* 04:08 pt1979@cumin2002: START - Cookbook sre.network.tls for network device asw1-23-ulsfo
* 04:01 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.16 (duration: 01m 48s)
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 10s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:37 ryankemper: [WDQS] [[phab:T410573|T410573]] repooled wdqs1011.eqiad.wmnet - erroneously depooled since `2025-11-19` by failed `sre.wdqs.reboot` cookbook
* 00:42 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 00:39 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
== 2026-03-09 ==
* 22:51 rzl: root@apt1002:~# reprepro --noskipold --restrict vopsbot update bookworm-wikimedia
* 22:34 bking@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM dse-k8s-ctrl1001.eqiad.wmnet
* 22:32 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1001.eqiad.wmnet
* 22:30 bking@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:29 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:28 bking@cumin2002: END (FAIL) - Cookbook sre.ganeti.reboot-vm (exit_code=99) for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:28 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:28 bking@cumin2002: END (FAIL) - Cookbook sre.ganeti.reboot-vm (exit_code=99) for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:28 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:03 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw2004-dev.codfw.wmnet with OS trixie
* 22:02 alexsanford: Redeployed security fix for [[phab:T419186|T419186]]
* 21:44 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
* 21:40 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
* 21:37 cdobbins@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7002.magru.wmnet
* 21:34 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7002.magru.wmnet with OS trixie
* 21:29 alexsanford: Deployed security fix for [[phab:T419186|T419186]]
* 21:22 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudgw2004-dev.codfw.wmnet with OS trixie
* 21:21 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudgw2004-dev.codfw.wmnet with OS trixie
* 21:17 dani@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249370{{!}}Pre-deploy participant recruitment survey on ptwiki and trwiki (T419275)]] (duration: 08m 15s)
* 21:13 dani@deploy2002: dani: Continuing with sync
* 21:11 dani@deploy2002: dani: Backport for [[gerrit:1249370{{!}}Pre-deploy participant recruitment survey on ptwiki and trwiki (T419275)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:09 dani@deploy2002: Started scap sync-world: Backport for [[gerrit:1249370{{!}}Pre-deploy participant recruitment survey on ptwiki and trwiki (T419275)]]
* 21:08 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
* 21:05 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cp7002.magru.wmnet with reason: host reimage
* 21:02 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7002.magru.wmnet with reason: host reimage
* 21:01 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
* 21:01 tgr_: removed private code for [[phab:T397244|T397244]]
* 21:01 ryankemper: [WDQS] Alright, these are re-entering a failed state quickly enough that we will need to identify the offender if we want to restore proper service. We could put in a temporary hack to restart every few minutes so we at least maintain some uptime, but the root cause is the usual 'we need a requestctl rule to block whoever's killing us' scenario
* 21:00 cdobbins@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7001.magru.wmnet [reason: Trixie reimaging]
* 20:57 ryankemper: [WDQS] Auto-remediation would have eventually restarted these, but some of them were staying below our current threshold of `threads > 1200`. May want to lower the threshold, or look at an additional metric type in the future
* 20:56 ryankemper: [WDQS] `ryankemper@cumin2002:~$ sudo -E cumin 'A:wdqs-main AND P<nowiki>{</nowiki>wdqs1*<nowiki>}</nowiki>' 'systemctl restart wdqs-blazegraph'`
* 20:54 ryankemper: [WDQS] `ryankemper@cumin2002:~$ sudo -E cumin 'A:wdqs-main AND P<nowiki>{</nowiki>wdqs2*<nowiki>}</nowiki>' 'systemctl restart wdqs-blazegraph'`
* 20:44 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudgw2004-dev.codfw.wmnet with OS trixie
* 20:43 tgr@deploy2002: Unlocked for deployment [MediaWiki]: working on private change (duration: 10m 10s)
* 20:36 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7002.magru.wmnet with OS trixie
* 20:33 tgr@deploy2002: Locking from deployment [MediaWiki]: working on private change
* 20:31 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247119{{!}}Enable parser survey for opted-out users on German/French/Polish wikis (T414852)]], [[gerrit:1249316{{!}}lift IP cap for womens month editathon (T419109)]] (duration: 13m 36s)
* 20:27 tgr@deploy2002: cscott, tgr, anzx: Continuing with sync
* 20:19 tgr@deploy2002: cscott, tgr, anzx: Backport for [[gerrit:1247119{{!}}Enable parser survey for opted-out users on German/French/Polish wikis (T414852)]], [[gerrit:1249316{{!}}lift IP cap for womens month editathon (T419109)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1247119{{!}}Enable parser survey for opted-out users on German/French/Polish wikis (T414852)]], [[gerrit:1249316{{!}}lift IP cap for womens month editathon (T419109)]]
* 20:13 aaron@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249363{{!}}Remove redundant math spec file from wwwportal (T418188)]] (duration: 06m 56s)
* 20:09 aaron@deploy2002: aaron: Continuing with sync
* 20:08 aaron@deploy2002: aaron: Backport for [[gerrit:1249363{{!}}Remove redundant math spec file from wwwportal (T418188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:06 aaron@deploy2002: Started scap sync-world: Backport for [[gerrit:1249363{{!}}Remove redundant math spec file from wwwportal (T418188)]]
* 20:01 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7016.*
* 19:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7001.magru.wmnet with OS trixie
* 19:51 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7016.magru.wmnet with OS trixie
* 19:49 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248911{{!}}Stop writing to il_to on commonswiki (T415787)]] (duration: 06m 04s)
* 19:45 zabe@deploy2002: zabe: Continuing with sync
* 19:44 zabe@deploy2002: zabe: Backport for [[gerrit:1248911{{!}}Stop writing to il_to on commonswiki (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:43 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248911{{!}}Stop writing to il_to on commonswiki (T415787)]]
* 19:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 19:28 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 19:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7001.magru.wmnet with reason: host reimage
* 19:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7016.magru.wmnet with reason: host reimage
* 19:23 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7001.magru.wmnet with reason: host reimage
* 19:19 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7016.magru.wmnet with reason: host reimage
* 19:15 cwhite@deploy2002: Finished deploy [performance/arc-lamp@aa8da8b]: {{Gerrit|Ie7e0355f89294a2927f9dbc28afec3a62d1752de}} (duration: 00m 08s)
* 19:15 cwhite@deploy2002: Started deploy [performance/arc-lamp@aa8da8b]: {{Gerrit|Ie7e0355f89294a2927f9dbc28afec3a62d1752de}}
* 19:14 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 19:14 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 19:05 herron@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249365{{!}}udp2log: switch to new hosts]] (duration: 09m 38s)
* 19:03 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:03 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:01 herron@deploy2002: herron: Continuing with sync
* 19:00 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:00 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 18:59 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 18:59 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 18:59 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 18:59 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 18:57 herron@deploy2002: herron: Backport for [[gerrit:1249365{{!}}udp2log: switch to new hosts]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:57 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7001.magru.wmnet with OS trixie
* 18:55 herron@deploy2002: Started scap sync-world: Backport for [[gerrit:1249365{{!}}udp2log: switch to new hosts]]
* 18:55 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7016.magru.wmnet with OS trixie
* 18:50 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 18:49 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 18:44 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 18:44 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 18:34 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 18:34 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 18:33 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 18:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 18:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 18:32 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 18:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 18:29 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 18:27 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 18:27 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 18:23 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 18:23 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
* 18:23 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 18:16 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 18:16 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 18:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1001.eqiad.wmnet
* 18:05 herron@deploy2002: Sync cancelled.
* 18:04 herron@deploy2002: herron: Backport for [[gerrit:1249361{{!}}Revert "udp2log: switch to new hosts"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:02 herron@deploy2002: Started scap sync-world: Backport for [[gerrit:1249361{{!}}Revert "udp2log: switch to new hosts"]]
* 18:01 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1001.eqiad.wmnet
* 17:54 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:47 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:42 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:42 herron@deploy2002: Sync cancelled.
* 17:40 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:39 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:38 mutante: contint1003 - unable to get uptime Caused by: Cumin execution failed (exit_code=2) [101/240] - attempted manual powercycle - Initializing Firmware Interfaces... blank screen [[phab:T418544|T418544]]
* 17:34 mutante: contint1003.mgmt - racadm serveraction powercycle [[phab:T418544|T418544]] - not reacting
* 17:25 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:25 herron@deploy2002: herron: Backport for [[gerrit:1249332{{!}}udp2log: switch to new hosts (T417002)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:23 herron@deploy2002: Started scap sync-world: Backport for [[gerrit:1249332{{!}}udp2log: switch to new hosts (T417002)]]
* 17:19 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host netflow4003.ulsfo.wmnet
* 17:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host netflow4003.ulsfo.wmnet with OS bookworm
* 17:13 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:08 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet
* 17:03 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 17:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet
* 17:00 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis kaiwiki in section s5
* 16:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on netflow4003.ulsfo.wmnet with reason: host reimage
* 16:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on netflow4003.ulsfo.wmnet with reason: host reimage
* 16:37 moritzm: installing gnupg security updates
* 16:31 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host netflow4003.ulsfo.wmnet with OS bookworm
* 16:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
* 16:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
* 16:30 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netflow4003.ulsfo.wmnet on all recursors
* 16:30 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netflow4003.ulsfo.wmnet on all recursors
* 16:30 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
* 16:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
* 16:26 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:26 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host netflow4003.ulsfo.wmnet
* 16:26 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 15:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on prometheus4003.ulsfo.wmnet with reason: host reimage
* 15:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on prometheus4003.ulsfo.wmnet with reason: host reimage
* 15:44 vgutierrez: vgutierrez@acmechief-test2001:~$ sudo -i systemctl disable reload-acme-chief-backend.timer - [[phab:T419352|T419352]]
* 15:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 15:37 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 15:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 15:30 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 15:26 elukey@cumin1003: END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 15:24 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 15:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2004.codfw.wmnet
* 15:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people2004.codfw.wmnet
* 15:12 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 15:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo1001.eqiad.wmnet
* 15:08 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 15:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo1001.eqiad.wmnet
* 14:50 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2009.codfw.wmnet with OS bookworm
* 14:49 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs2009.codfw.wmnet with OS bullseye
* 14:45 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 14:35 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249291{{!}}Hide 2fa-warning Echo category from preferences (T419111)]] (duration: 06m 07s)
* 14:35 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis kaiwiki in section s5
* 14:34 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.sanitize-wiki (exit_code=99) Managing sanitization for wikis urwikisource in section s5
* 14:31 mszwarc@deploy2002: mszwarc: Continuing with sync
* 14:31 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1249291{{!}}Hide 2fa-warning Echo category from preferences (T419111)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:30 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis urwikisource in section s5
* 14:29 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1249291{{!}}Hide 2fa-warning Echo category from preferences (T419111)]]
* 14:25 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Checking sanitization for wikis urwikisource in section s5
* 14:22 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Checking sanitization for wikis urwikisource in section s5
* 14:20 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 14:15 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 14:15 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249243{{!}}JS SDK: Add getExperimentByPrefix() (T419191)]], [[gerrit:1249242{{!}}ext.wikimediaEvents: pageVisit -> loggedOutReaderRetention (T419191)]] (duration: 09m 39s)
* 14:11 phuedx@deploy2002: phuedx: Continuing with sync
* 14:07 phuedx@deploy2002: phuedx: Backport for [[gerrit:1249243{{!}}JS SDK: Add getExperimentByPrefix() (T419191)]], [[gerrit:1249242{{!}}ext.wikimediaEvents: pageVisit -> loggedOutReaderRetention (T419191)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:05 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1249243{{!}}JS SDK: Add getExperimentByPrefix() (T419191)]], [[gerrit:1249242{{!}}ext.wikimediaEvents: pageVisit -> loggedOutReaderRetention (T419191)]]
* 14:03 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet
* 13:54 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2009.codfw.wmnet with OS bullseye
* 13:50 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249262{{!}}Disable MetricsPlatform extension (T416865)]] (duration: 08m 02s)
* 13:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet
* 13:46 phuedx@deploy2002: phuedx, sfaci: Continuing with sync
* 13:44 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 13:44 phuedx@deploy2002: phuedx, sfaci: Backport for [[gerrit:1249262{{!}}Disable MetricsPlatform extension (T416865)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:42 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1249262{{!}}Disable MetricsPlatform extension (T416865)]]
* 13:39 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248075{{!}}Confirmemail: Log delay between email sent and confirmation (T415902)]], [[gerrit:1247651{{!}}Enable confirmemail logstash channel (T415902)]] (duration: 11m 16s)
* 13:35 phuedx@deploy2002: mmartorana, phuedx: Continuing with sync
* 13:30 phuedx@deploy2002: mmartorana, phuedx: Backport for [[gerrit:1248075{{!}}Confirmemail: Log delay between email sent and confirmation (T415902)]], [[gerrit:1247651{{!}}Enable confirmemail logstash channel (T415902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1248075{{!}}Confirmemail: Log delay between email sent and confirmation (T415902)]], [[gerrit:1247651{{!}}Enable confirmemail logstash channel (T415902)]]
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 13:04 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 12:55 moritzm: installing Kerberos security updates
* 12:29 moritzm: installing python3.9 security updates
* 12:11 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 12:00 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1239026{{!}}Revert "CommonSettings: Temporarily set $wgOATHUserHandlesTable = true" (T416544)]], [[gerrit:1249253{{!}}CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled]] (duration: 06m 13s)
* 11:56 reedy@deploy2002: reedy: Continuing with sync
* 11:56 reedy@deploy2002: reedy: Backport for [[gerrit:1239026{{!}}Revert "CommonSettings: Temporarily set $wgOATHUserHandlesTable = true" (T416544)]], [[gerrit:1249253{{!}}CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:54 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1239026{{!}}Revert "CommonSettings: Temporarily set $wgOATHUserHandlesTable = true" (T416544)]], [[gerrit:1249253{{!}}CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled]]
* 11:44 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249245{{!}}Hooks: Really only add global logging context for pageviews]] (duration: 12m 02s)
* 11:38 phuedx@deploy2002: phuedx: Continuing with sync
* 11:34 phuedx@deploy2002: phuedx: Backport for [[gerrit:1249245{{!}}Hooks: Really only add global logging context for pageviews]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:32 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1249245{{!}}Hooks: Really only add global logging context for pageviews]]
* 11:29 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 11:29 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 11:03 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 11:03 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:57 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 10:56 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 10:50 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:49 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:45 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus4003.ulsfo.wmnet
* 10:45 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 10:45 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
* 10:44 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus4003.ulsfo.wmnet on all recursors
* 10:43 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache prometheus4003.ulsfo.wmnet on all recursors
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
* 10:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
* 10:40 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:39 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:39 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:39 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host prometheus4003.ulsfo.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:33 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:17 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 10:12 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:51 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main'.
* 09:46 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 09:43 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:40 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus4003.ulsfo.wmnet
* 09:40 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 09:35 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:35 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host prometheus4003.ulsfo.wmnet
* 09:31 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host frdb1008
* 09:31 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host frdb1008
* 09:29 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main'.
* 09:05 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main'.
* 08:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
* 08:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 08:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 08:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 08:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 08:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 08:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 08:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
* 08:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
* 08:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 08:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 08:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
* 08:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:25 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
* 08:24 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 08:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 08:21 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main'.
* 08:16 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
* 08:16 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
* 08:07 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo and group 1
* 08:07 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo and group 1
* 07:37 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248806{{!}}Add a script to send mandatory 2FA Echo notification (T419111)]], [[gerrit:1248821{{!}}Set $wgOATH2FARequiredGroupRemovalPages for interface-admins (T417880)]] (duration: 34m 41s)
* 07:23 mszwarc@deploy2002: mszwarc: Continuing with sync
* 07:22 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1248806{{!}}Add a script to send mandatory 2FA Echo notification (T419111)]], [[gerrit:1248821{{!}}Set $wgOATH2FARequiredGroupRemovalPages for interface-admins (T417880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:02 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1248806{{!}}Add a script to send mandatory 2FA Echo notification (T419111)]], [[gerrit:1248821{{!}}Set $wgOATH2FARequiredGroupRemovalPages for interface-admins (T417880)]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 58s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-08 ==
* 20:28 vgutierrez@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on acmechief-test2001.codfw.wmnet with reason: GTS issues
* 02:01 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 00m 59s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-07 ==
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 23s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:20 krinkle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248952{{!}}CSP: restore toolforge/wmcs entry in false positive list]] (duration: 10m 46s)
* 01:16 krinkle@deploy2002: krinkle: Continuing with sync
* 01:11 krinkle@deploy2002: krinkle: Backport for [[gerrit:1248952{{!}}CSP: restore toolforge/wmcs entry in false positive list]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:09 krinkle@deploy2002: Started scap sync-world: Backport for [[gerrit:1248952{{!}}CSP: restore toolforge/wmcs entry in false positive list]]
* 00:22 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2043.*
* 00:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp2043.codfw.wmnet
* 00:05 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp2043.codfw.wmnet
== 2026-03-06 ==
* 23:29 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs2009.codfw.wmnet with OS bullseye
* 23:13 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 23:07 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 22:46 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs2009
* 22:46 ryankemper@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2009
* 22:46 ryankemper@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2009
* 22:46 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs2009.codfw.wmnet 141.0.192.10.in-addr.arpa 1.4.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:46 ryankemper@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs2009.codfw.wmnet 141.0.192.10.in-addr.arpa 1.4.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:45 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:45 ryankemper@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs2009 - ryankemper@cumin2002"
* 22:45 ryankemper@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs2009 - ryankemper@cumin2002"
* 22:41 ryankemper@cumin2002: START - Cookbook sre.dns.netbox
* 22:40 ryankemper@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs2009
* 22:39 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2009.codfw.wmnet with OS bullseye
* 19:48 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:47 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:47 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:46 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:36 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host wdqs2009.codfw.wmnet
* 19:23 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2009.codfw.wmnet
* 19:17 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on wdqs2009.codfw.wmnet with reason: NFS might be hung, about to reboot
* 18:56 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp2043.codfw.wmnet with reason: troubleshooting for network drops
* 18:44 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp2043.*
* 18:29 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts an-backup-datanode1033.eqiad.wmnet
* 18:29 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:29 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-backup-datanode1033.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1003"
* 18:28 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-backup-datanode1033.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1003"
* 17:59 ebernhardson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248858{{!}}cirrus: Use https for semanticsearch-test cluster]] (duration: 11m 20s)
* 17:53 ebernhardson@deploy2002: ebernhardson: Continuing with sync
* 17:52 ebernhardson@deploy2002: ebernhardson: Backport for [[gerrit:1248858{{!}}cirrus: Use https for semanticsearch-test cluster]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:51 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:51 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:47 ebernhardson@deploy2002: Started scap sync-world: Backport for [[gerrit:1248858{{!}}cirrus: Use https for semanticsearch-test cluster]]
* 17:42 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:42 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:40 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:40 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:11 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:11 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:10 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 17:05 hashar@deploy2002: Finished deploy [gerrit/gerrit@b8183ba]: wm-checks-api: add tooltip to the CheckRun Run action (duration: 00m 13s)
* 17:05 hashar@deploy2002: Started deploy [gerrit/gerrit@b8183ba]: wm-checks-api: add tooltip to the CheckRun Run action
* 17:04 btullis@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-backup-datanode1033.eqiad.wmnet
* 16:48 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 16:48 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 16:23 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 16:23 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 15:57 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:57 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:56 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:56 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:52 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2354-2356].codfw.wmnet
* 15:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2354-2356].codfw.wmnet
* 15:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2356.codfw.wmnet with OS trixie
* 15:46 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2355.codfw.wmnet with OS trixie
* 15:42 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2354.codfw.wmnet with OS trixie
* 15:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2356.codfw.wmnet with reason: host reimage
* 15:31 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
* 15:30 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
* 15:28 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main'.
* 15:28 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
* 15:28 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
* 15:26 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
* 15:26 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
* 15:26 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2355.codfw.wmnet with reason: host reimage
* 15:24 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 15:23 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 15:23 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2354.codfw.wmnet with reason: host reimage
* 15:19 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 15:19 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 15:17 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 15:17 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 15:17 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2356.codfw.wmnet with reason: host reimage
* 15:16 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2355.codfw.wmnet with reason: host reimage
* 15:16 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2354.codfw.wmnet with reason: host reimage
* 15:15 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main'.
* 15:10 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 15:09 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 15:08 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 15:08 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 15:06 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 15:05 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 15:05 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 15:05 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 15:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2356.codfw.wmnet with OS trixie
* 15:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2355.codfw.wmnet with OS trixie
* 15:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2354.codfw.wmnet with OS trixie
* 15:02 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
* 15:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2348-2353].codfw.wmnet
* 15:02 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
* 15:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2348-2353].codfw.wmnet
* 14:59 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2353.codfw.wmnet with OS trixie
* 14:57 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2349.codfw.wmnet with OS trixie
* 14:57 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:56 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
* 14:53 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:52 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
* 14:52 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2351.codfw.wmnet with OS trixie
* 14:49 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 14:49 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 14:48 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2352.codfw.wmnet with OS trixie
* 14:48 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 14:48 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 14:48 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:47 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 14:45 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2350.codfw.wmnet with OS trixie
* 14:44 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 14:43 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:43 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 14:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2348.codfw.wmnet with OS trixie
* 14:41 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2353.codfw.wmnet with reason: host reimage
* 14:37 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2349.codfw.wmnet with reason: host reimage
* 14:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2351.codfw.wmnet with reason: host reimage
* 14:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2352.codfw.wmnet with reason: host reimage
* 14:29 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:28 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:26 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2350.codfw.wmnet with reason: host reimage
* 14:23 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2348.codfw.wmnet with reason: host reimage
* 14:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2351.codfw.wmnet with reason: host reimage
* 14:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2352.codfw.wmnet with reason: host reimage
* 14:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2353.codfw.wmnet with reason: host reimage
* 14:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2350.codfw.wmnet with reason: host reimage
* 14:19 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2349.codfw.wmnet with reason: host reimage
* 14:19 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2348.codfw.wmnet with reason: host reimage
* 14:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2353.codfw.wmnet with OS trixie
* 14:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2352.codfw.wmnet with OS trixie
* 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2351.codfw.wmnet with OS trixie
* 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2350.codfw.wmnet with OS trixie
* 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2349.codfw.wmnet with OS trixie
* 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2348.codfw.wmnet with OS trixie
* 14:03 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2349.codfw.wmnet with OS trixie
* 14:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2349.codfw.wmnet with OS trixie
* 14:03 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2348.codfw.wmnet with OS trixie
* 14:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2348.codfw.wmnet with OS trixie
* 14:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2342-2347].codfw.wmnet
* 14:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2342-2347].codfw.wmnet
* 14:01 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2347.codfw.wmnet with OS trixie
* 13:57 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2346.codfw.wmnet with OS trixie
* 13:55 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2343.codfw.wmnet with OS trixie
* 13:50 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2345.codfw.wmnet with OS trixie
* 13:48 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2344.codfw.wmnet with OS trixie
* 13:45 dreamyjazz@deploy2002: mwscript-k8s job started: foreachwikiindblist checkuser-suggested-investigations CheckUser:queueAutoCloseSICases.php # [[phab:T418591|T418591]]
* 13:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2342.codfw.wmnet with OS trixie
* 13:42 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2347.codfw.wmnet with reason: host reimage
* 13:38 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2346.codfw.wmnet with reason: host reimage
* 13:35 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2343.codfw.wmnet with reason: host reimage
* 13:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2345.codfw.wmnet with reason: host reimage
* 13:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2344.codfw.wmnet with reason: host reimage
* 13:24 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2342.codfw.wmnet with reason: host reimage
* 13:21 Dreamy_Jazz: Running foreachwikiindblist checkuser-suggested-investigations.dblist ~/PopulateSiuInfo.php --batch-size=1000 for [[phab:T411118|T411118]]
* 13:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2347.codfw.wmnet with reason: host reimage
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2346.codfw.wmnet with reason: host reimage
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2345.codfw.wmnet with reason: host reimage
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2344.codfw.wmnet with reason: host reimage
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2343.codfw.wmnet with reason: host reimage
* 13:19 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2342.codfw.wmnet with reason: host reimage
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2347.codfw.wmnet with OS trixie
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2346.codfw.wmnet with OS trixie
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2345.codfw.wmnet with OS trixie
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2344.codfw.wmnet with OS trixie
* 13:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2343.codfw.wmnet with OS trixie
* 13:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2342.codfw.wmnet with OS trixie
* 13:05 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2336-2341].codfw.wmnet
* 13:05 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2336-2341].codfw.wmnet
* 13:01 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2341.codfw.wmnet with OS trixie
* 12:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2340.codfw.wmnet with OS trixie
* 12:49 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2337.codfw.wmnet with OS trixie
* 12:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2338.codfw.wmnet with OS trixie
* 12:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2336.codfw.wmnet with OS trixie
* 12:40 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2341.codfw.wmnet with reason: host reimage
* 12:35 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2339.codfw.wmnet with OS trixie
* 12:34 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2340.codfw.wmnet with reason: host reimage
* 12:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2337.codfw.wmnet with reason: host reimage
* 12:26 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2338.codfw.wmnet with reason: host reimage
* 12:22 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2336.codfw.wmnet with reason: host reimage
* 12:18 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2339.codfw.wmnet with reason: host reimage
* 12:13 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2340.codfw.wmnet with reason: host reimage
* 12:13 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2341.codfw.wmnet with reason: host reimage
* 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2337.codfw.wmnet with reason: host reimage
* 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2338.codfw.wmnet with reason: host reimage
* 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2336.codfw.wmnet with reason: host reimage
* 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2339.codfw.wmnet with reason: host reimage
* 12:00 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2341.codfw.wmnet with OS trixie
* 12:00 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2340.codfw.wmnet with OS trixie
* 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2339.codfw.wmnet with OS trixie
* 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2338.codfw.wmnet with OS trixie
* 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2337.codfw.wmnet with OS trixie
* 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2336.codfw.wmnet with OS trixie
* 11:56 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2333-2335].codfw.wmnet
* 11:55 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2333-2335].codfw.wmnet
* 11:55 elukey@cumin1003: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 11:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1207.eqiad.wmnet
* 11:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2335.codfw.wmnet with OS trixie
* 11:53 moritzm: uploaded icu 72.1-3+deb12u1~wmf11u1 to component/php83-icu72 [[phab:T419058|T419058]] (backport of ICU 72 from Bookworm to Bullseye, built to be co-installable with the native ICU from Bullseye)
* 11:50 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2334.codfw.wmnet with OS trixie
* 11:49 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1207.eqiad.wmnet
* 11:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1205.eqiad.wmnet
* 11:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2333.codfw.wmnet with OS trixie
* 11:39 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 11:39 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1205.eqiad.wmnet
* 11:34 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2335.codfw.wmnet with reason: host reimage
* 11:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2334.codfw.wmnet with reason: host reimage
* 11:27 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2333.codfw.wmnet with reason: host reimage
* 11:23 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2335.codfw.wmnet with reason: host reimage
* 11:22 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2334.codfw.wmnet with reason: host reimage
* 11:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2333.codfw.wmnet with reason: host reimage
* 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2334.codfw.wmnet with OS trixie
* 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2335.codfw.wmnet with OS trixie
* 11:08 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2333.codfw.wmnet with OS trixie
* 11:06 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2332.codfw.wmnet
* 11:05 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2332.codfw.wmnet
* 11:02 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2332.codfw.wmnet with OS trixie
* 10:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2332.codfw.wmnet with reason: host reimage
* 10:36 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2332.codfw.wmnet with reason: host reimage
* 10:23 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2332.codfw.wmnet with OS trixie
* 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1199.eqiad.wmnet
* 10:21 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1194.eqiad.wmnet
* 10:16 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2332-2356].codfw.wmnet
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1194.eqiad.wmnet
* 10:09 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1199.eqiad.wmnet
* 10:09 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2332-2356].codfw.wmnet
* 09:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:39 Emperor: repool ms-fe1013 after PXE work [[phab:T401966|T401966]]
* 09:23 derick@deploy2002: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=pmswiki --logwiki=metawiki Wikilimes Limes.pink # [[phab:T419184|T419184]]
* 09:10 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:09 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:08 elukey@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-fe1013.eqiad.wmnet
* 09:08 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-fe1013.eqiad.wmnet
* 08:57 elukey@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-fe1013.eqiad.wmnet
* 08:56 elukey@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-fe1013.eqiad.wmnet
* 08:54 elukey@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-fe1013.eqiad.wmnet
* 08:42 elukey@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-fe1013.eqiad.wmnet
* 08:25 moritzm: uploaded openjdk-8 8u482-ga-1~deb12u1 to component/jdk8 of bookworm-wikimedia
* 08:11 moritzm: imported prometheus-ganeti-exporter 0.3+deb12u2 for bookworm-wikimedia [[phab:T419166|T419166]]
* 06:23 ryankemper@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
* 06:23 ryankemper@deploy2002: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
* 06:23 ryankemper@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
* 06:23 ryankemper@deploy2002: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
* 06:22 ryankemper@deploy2002: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
* 06:22 ryankemper@deploy2002: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
* 02:59 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:59 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
* 02:59 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
* 02:56 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 02:21 zabe: zabe@deploy2002:/srv/mediawiki-staging$ foreachwiki extensions/TimedMediaHandler/maintenance/migrateTranscodeStates.php --force # [[phab:T415064|T415064]]
* 02:16 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248658{{!}}Update interwiki cache]] (duration: 06m 38s)
* 02:12 zabe@deploy2002: mwscript-k8s job started: foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https # [[phab:T415978|T415978]], [[phab:T414241|T414241]]
* 02:12 zabe@deploy2002: zabe: Continuing with sync
* 02:11 zabe@deploy2002: zabe: Backport for [[gerrit:1248658{{!}}Update interwiki cache]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 02:09 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248658{{!}}Update interwiki cache]]
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 23s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:59 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248656{{!}}Set urwikisource to rtl (T415960)]] (duration: 06m 39s)
* 01:55 zabe@deploy2002: zabe: Continuing with sync
* 01:54 zabe@deploy2002: zabe: Backport for [[gerrit:1248656{{!}}Set urwikisource to rtl (T415960)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:53 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248656{{!}}Set urwikisource to rtl (T415960)]]
* 01:45 zabe@deploy2002: Sync cancelled.
* 01:43 zabe@deploy2002: zabe: Backport for [[gerrit:1248653{{!}}Activate urwikisource (T415960)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:42 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248653{{!}}Activate urwikisource (T415960)]]
* 01:38 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248652{{!}}Prepare urwikisource (T415960)]] (duration: 06m 18s)
* 01:34 zabe@deploy2002: zabe: Continuing with sync
* 01:34 zabe@deploy2002: zabe: Backport for [[gerrit:1248652{{!}}Prepare urwikisource (T415960)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:32 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248652{{!}}Prepare urwikisource (T415960)]]
* 01:29 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248650{{!}}Activate kaiwiki (T414234)]] (duration: 06m 57s)
* 01:25 zabe@deploy2002: zabe: Continuing with sync
* 01:24 zabe@deploy2002: zabe: Backport for [[gerrit:1248650{{!}}Activate kaiwiki (T414234)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:22 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248650{{!}}Activate kaiwiki (T414234)]]
* 01:17 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248647{{!}}Prepare kaiwiki (T414234)]] (duration: 07m 25s)
* 01:13 zabe@deploy2002: zabe: Continuing with sync
* 01:11 zabe@deploy2002: zabe: Backport for [[gerrit:1248647{{!}}Prepare kaiwiki (T414234)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:09 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248647{{!}}Prepare kaiwiki (T414234)]]
* 00:33 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248493{{!}}Stop writing to il_to on all wikis except commons (T415787)]] (duration: 06m 22s)
* 00:29 zabe@deploy2002: zabe: Continuing with sync
* 00:28 zabe@deploy2002: zabe: Backport for [[gerrit:1248493{{!}}Stop writing to il_to on all wikis except commons (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:27 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248493{{!}}Stop writing to il_to on all wikis except commons (T415787)]]
* 00:05 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248628{{!}}Re-enable AllowUserJs (T419137)]] (duration: 08m 08s)
* 00:01 catrope@deploy2002: catrope, kharlan: Continuing with sync
== 2026-03-05 ==
* 23:58 catrope@deploy2002: catrope, kharlan: Backport for [[gerrit:1248628{{!}}Re-enable AllowUserJs (T419137)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:56 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1248628{{!}}Re-enable AllowUserJs (T419137)]]
* 23:52 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248636{{!}}CSP: Update false positives list]] (duration: 06m 34s)
* 23:52 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint2003.wikimedia.org with OS trixie
* 23:47 catrope@deploy2002: catrope: Continuing with sync
* 23:47 catrope@deploy2002: catrope: Backport for [[gerrit:1248636{{!}}CSP: Update false positives list]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:45 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1248636{{!}}CSP: Update false positives list]]
* 23:33 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint2003.wikimedia.org with reason: host reimage
* 23:29 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on contint2003.wikimedia.org with reason: host reimage
* 23:15 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1238028{{!}}Using Hadoop for MostTranscludedPages on commonswiki (T416927)]] (duration: 06m 27s)
* 23:11 zabe@deploy2002: zabe: Continuing with sync
* 23:10 zabe@deploy2002: zabe: Backport for [[gerrit:1238028{{!}}Using Hadoop for MostTranscludedPages on commonswiki (T416927)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:09 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2003.wikimedia.org with OS trixie
* 23:08 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1238028{{!}}Using Hadoop for MostTranscludedPages on commonswiki (T416927)]]
* 22:45 maryum: Deployed security fix for [[phab:T418254|T418254]]
* 22:35 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248483{{!}}SpecialWantedFiles: Use lt_title instead of lt_to (T299953)]] (duration: 06m 12s)
* 22:31 zabe@deploy2002: zabe: Continuing with sync
* 22:30 zabe@deploy2002: zabe: Backport for [[gerrit:1248483{{!}}SpecialWantedFiles: Use lt_title instead of lt_to (T299953)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:28 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248483{{!}}SpecialWantedFiles: Use lt_title instead of lt_to (T299953)]]
* 21:43 ebernhardson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248508{{!}}cirrus: Align semanticsearch cluster group name with routing (T413969)]] (duration: 07m 20s)
* 21:39 ebernhardson@deploy2002: ebernhardson: Continuing with sync
* 21:38 ebernhardson@deploy2002: ebernhardson: Backport for [[gerrit:1248508{{!}}cirrus: Align semanticsearch cluster group name with routing (T413969)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:36 ebernhardson@deploy2002: Started scap sync-world: Backport for [[gerrit:1248508{{!}}cirrus: Align semanticsearch cluster group name with routing (T413969)]]
* 21:04 jhathaway@dns1004: END - running authdns-update
* 21:02 jhathaway@dns1004: START - running authdns-update
* 20:53 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:52 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new service IPs for sophroid - jasmine@cumin2002"
* 20:52 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new service IPs for sophroid - jasmine@cumin2002"
* 20:47 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 20:28 cdanis: apt built and imported jwt-authorizer 1.3.0-1
* 20:16 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.18 refs [[phab:T413809|T413809]]
* 20:04 krinkle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248574{{!}}Allow toolforge APIs in enforced CSP mode (T135963 T419137 T220475)]] (duration: 07m 37s)
* 20:00 krinkle@deploy2002: krinkle: Continuing with sync
* 19:58 krinkle@deploy2002: krinkle: Backport for [[gerrit:1248574{{!}}Allow toolforge APIs in enforced CSP mode (T135963 T419137 T220475)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:56 krinkle@deploy2002: Started scap sync-world: Backport for [[gerrit:1248574{{!}}Allow toolforge APIs in enforced CSP mode (T135963 T419137 T220475)]]
* 19:21 sbassett@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248571{{!}}Re-enable Site JS (T419137 T419138)]] (duration: 06m 57s)
* 19:17 sbassett@deploy2002: sbassett: Continuing with sync
* 19:16 sbassett@deploy2002: sbassett: Backport for [[gerrit:1248571{{!}}Re-enable Site JS (T419137 T419138)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:15 sbassett@deploy2002: Started scap sync-world: Backport for [[gerrit:1248571{{!}}Re-enable Site JS (T419137 T419138)]]
* 19:04 dr0ptp4kt: Deploying change {{Gerrit|1239200}} for refinery ([[phab:T416481|T416481]]) using scap, then deployed onto hdfs
* 19:03 dr0ptp4kt: Deployed refinery change {{Gerrit|1240253}} ([[phab:T414478|T414478]]), {{Gerrit|1240253}} (no-op) for refinery ([[phab:T414478|T414478]]) using scap, then deployed onto hdfs
* 18:58 dr0ptp4kt@deploy2002: Finished deploy [analytics/refinery@dd641b1] (thin): Regular analytics weekly train THIN [analytics/refinery@dd641b15] (duration: 02m 02s)
* 18:56 dr0ptp4kt@deploy2002: Started deploy [analytics/refinery@dd641b1] (thin): Regular analytics weekly train THIN [analytics/refinery@dd641b15]
* 18:55 dr0ptp4kt@deploy2002: Finished deploy [analytics/refinery@dd641b1]: Regular analytics weekly train [analytics/refinery@dd641b15] (duration: 04m 18s)
* 18:50 dr0ptp4kt@deploy2002: Started deploy [analytics/refinery@dd641b1]: Regular analytics weekly train [analytics/refinery@dd641b15]
* 18:49 dr0ptp4kt@deploy2002: Finished deploy [analytics/refinery@dd641b1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@dd641b15] (duration: 01m 57s)
* 18:47 dr0ptp4kt: Deploying change {{Gerrit|1239200}} for refinery ([[phab:T416481|T416481]])
* 18:47 dr0ptp4kt@deploy2002: Started deploy [analytics/refinery@dd641b1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@dd641b15]
* 18:31 eevans@dns1004: END - running authdns-update
* 18:30 eevans@dns1004: START - running authdns-update
* 18:30 sukhe: sudo cumin -b51 "A:cp" "run-puppet-agent --enable 'rolling out 1248544'"
* 18:16 sukhe: sudo cumin "A:cp" "disable-puppet 'rolling out 1248544'"
* 18:06 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:06 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
* 18:06 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
* 18:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 17:31 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248536{{!}}Enable wgUseSiteJs on donatewiki (T419138)]] (duration: 09m 57s)
* 17:27 mszwarc@deploy2002: mszwarc, krinkle: Continuing with sync
* 17:26 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint2003.wikimedia.org with OS bookworm
* 17:26 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:23 mszwarc@deploy2002: mszwarc, krinkle: Backport for [[gerrit:1248536{{!}}Enable wgUseSiteJs on donatewiki (T419138)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:21 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1248536{{!}}Enable wgUseSiteJs on donatewiki (T419138)]]
* 17:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1178.eqiad.wmnet
* 17:16 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:12 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1162.eqiad.wmnet
* 17:12 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1162.eqiad.wmnet
* 17:10 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wikikube-worker1162.eqiad.wmnet
* 17:10 cgoubert@cumin1003: START - Cookbook sre.hosts.remove-downtime for wikikube-worker1162.eqiad.wmnet
* 17:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 17:05 taavi@cumin1003: dbctl commit (dc=all): 'enable writes', diff saved to https://phabricator.wikimedia.org/P89812 and previous config saved to /var/cache/conftool/dbconfig/20260305-170556-taavi.json
* 16:03 oblivian@cumin1003: dbctl commit (dc=all): 'read only s6', diff saved to https://phabricator.wikimedia.org/P89810 and previous config saved to /var/cache/conftool/dbconfig/20260305-160348-oblivian.json
* 15:32 taavi@cumin1003: dbctl commit (dc=all): 'set global ro', diff saved to https://phabricator.wikimedia.org/P89808 and previous config saved to /var/cache/conftool/dbconfig/20260305-153203-taavi.json
* 15:31 mszwarc@deploy2002: mszwarc: Continuing with sync
* 15:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1178.eqiad.wmnet
* 15:31 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1248509{{!}}Disable custom JS for a moment]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:29 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1248509{{!}}Disable custom JS for a moment]]
* 15:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['contint2003']
* 15:25 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['contint2003']
* 15:23 ebernhardson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248506{{!}}cirrus: Correct semantic builder config (T413969)]] (duration: 07m 39s)
* 15:22 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:19 ebernhardson@deploy2002: ebernhardson: Continuing with sync
* 15:18 ebernhardson@deploy2002: ebernhardson: Backport for [[gerrit:1248506{{!}}cirrus: Correct semantic builder config (T413969)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:16 ebernhardson@deploy2002: Started scap sync-world: Backport for [[gerrit:1248506{{!}}cirrus: Correct semantic builder config (T413969)]]
* 15:11 elukey@cumin1003: END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 15:10 ebernhardson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244713{{!}}cirrus: Add semantic search test cluster (T413969)]] (duration: 09m 18s)
* 15:06 ebernhardson@deploy2002: ebernhardson: Continuing with sync
* 15:04 sukhe@dns1004: END - running authdns-update
* 15:03 sukhe@dns1004: START - running authdns-update
* 15:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:02 ebernhardson@deploy2002: ebernhardson: Backport for [[gerrit:1244713{{!}}cirrus: Add semantic search test cluster (T413969)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:02 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host contint2003
* 15:02 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host contint2003
* 15:00 ebernhardson@deploy2002: Started scap sync-world: Backport for [[gerrit:1244713{{!}}cirrus: Add semantic search test cluster (T413969)]]
* 14:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 14:53 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 14:50 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 14:38 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1010
* 14:38 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1010
* 14:32 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1010
* 14:32 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1010
* 14:32 sukhe@dns1004: END - running authdns-update
* 14:30 sukhe@dns1004: START - running authdns-update
* 14:28 elukey@cumin1003: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 14:28 sukhe@dns1004: START - running authdns-update
* 14:27 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1231.eqiad.wmnet
* 14:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1230.eqiad.wmnet
* 14:24 bking@dns1004: START - running authdns-update
* 14:15 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1230.eqiad.wmnet
* 14:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1229.eqiad.wmnet
* 14:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'ores-legacy' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 14:05 moritzm: imported nodejs 24.14.0-1nodesource1 to thirdparty/node24 [[phab:T418440|T418440]]
* 14:03 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1229.eqiad.wmnet
* 14:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1228.eqiad.wmnet
* 14:01 moritzm: initialised ganeti02/ulsfo cluster [[phab:T418993|T418993]]
* 13:57 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:56 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:56 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:53 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:52 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1228.eqiad.wmnet
* 13:52 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1227.eqiad.wmnet
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:46 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:43 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:42 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:42 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1199.eqiad.wmnet
* 13:40 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1227.eqiad.wmnet
* 13:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1226.eqiad.wmnet
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'ores-legacy' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:35 moritzm: installing glib2.0 security updates
* 13:35 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1199.eqiad.wmnet
* 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:26 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1226.eqiad.wmnet
* 13:26 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1225.eqiad.wmnet
* 13:26 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:15 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1225.eqiad.wmnet
* 13:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1224.eqiad.wmnet
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4006.ulsfo.wmnet with OS bookworm
* 13:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:07 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new VIP for routed ganeti in ulsfo - jmm@cumin2002"
* 13:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new VIP for routed ganeti in ulsfo - jmm@cumin2002"
* 13:06 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:05 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:02 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:01 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1224.eqiad.wmnet
* 13:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1223.eqiad.wmnet
* 13:00 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:58 cgoubert@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on wikikube-worker1162.eqiad.wmnet with reason: dcops intervention
* 12:57 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1162.eqiad.wmnet
* 12:56 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1162.eqiad.wmnet
* 12:55 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 12:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
* 12:49 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1223.eqiad.wmnet
* 12:49 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1222.eqiad.wmnet
* 12:46 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 12:43 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
* 12:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1222.eqiad.wmnet
* 12:37 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1221.eqiad.wmnet
* 12:23 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1221.eqiad.wmnet
* 12:23 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1220.eqiad.wmnet
* 12:23 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4006.ulsfo.wmnet with OS bookworm
* 12:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1220.eqiad.wmnet
* 11:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4005.ulsfo.wmnet
* 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4005.ulsfo.wmnet
* 11:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1236.eqiad.wmnet
* 11:29 moritzm: remove ganeti4006 from ganeti/ulsfo cluster [[phab:T418993|T418993]]
* 11:25 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1236.eqiad.wmnet
* 11:25 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1235.eqiad.wmnet
* 11:16 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1235.eqiad.wmnet
* 11:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1234.eqiad.wmnet
* 11:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1234.eqiad.wmnet
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1233.eqiad.wmnet
* 11:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1233.eqiad.wmnet
* 11:02 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1232.eqiad.wmnet
* 11:02 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 11:00 elukey@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4005.ulsfo.wmnet with OS bookworm
* 10:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 10:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1232.eqiad.wmnet
* 10:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1231.eqiad.wmnet
* 10:54 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 10:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1231.eqiad.wmnet
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1230.eqiad.wmnet
* 10:41 elukey@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4005.ulsfo.wmnet with reason: host reimage
* 10:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1230.eqiad.wmnet
* 10:37 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1229.eqiad.wmnet
* 10:32 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4005.ulsfo.wmnet with reason: host reimage
* 10:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1229.eqiad.wmnet
* 10:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1228.eqiad.wmnet
* 10:24 moritzm: installing Java 8 security updates
* 10:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1228.eqiad.wmnet
* 10:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1227.eqiad.wmnet
* 10:15 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1227.eqiad.wmnet
* 10:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1226.eqiad.wmnet
* 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
* 10:11 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4005.ulsfo.wmnet with OS bookworm
* 10:10 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ganeti4005.ulsfo.wmnet
* 10:08 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ganeti4005.ulsfo.wmnet
* 10:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add gw-virtual.ulsfo.wmnet - ayounsi@cumin1003"
* 10:07 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1226.eqiad.wmnet
* 10:07 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1225.eqiad.wmnet
* 09:59 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1225.eqiad.wmnet
* 09:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1224.eqiad.wmnet
* 09:52 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1224.eqiad.wmnet
* 09:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1223.eqiad.wmnet
* 09:44 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1223.eqiad.wmnet
* 09:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1222.eqiad.wmnet
* 09:43 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add gw-virtual.ulsfo.wmnet - ayounsi@cumin1003"
* 09:36 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1222.eqiad.wmnet
* 09:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1221.eqiad.wmnet
* 09:32 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:32 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:28 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1221.eqiad.wmnet
* 09:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1220.eqiad.wmnet
* 09:20 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1220.eqiad.wmnet
* 09:02 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
* 08:38 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247990{{!}}Drop 'centralnoticeadmin' from $wgOATHRequiredForGroups (T418580)]] (duration: 07m 07s)
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/airflow-main: apply
* 08:34 mszwarc@deploy2002: mszwarc: Continuing with sync
* 08:33 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1247990{{!}}Drop 'centralnoticeadmin' from $wgOATHRequiredForGroups (T418580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:30 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1247990{{!}}Drop 'centralnoticeadmin' from $wgOATHRequiredForGroups (T418580)]]
* 08:29 gehel@dns1004: END - running authdns-update
* 08:28 gehel@dns1004: START - running authdns-update
* 08:27 moritzm: installing mbedtls security updates
* 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 08:15 hashar@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248314{{!}}Revert "zhwiki: Add 2026 CNY celebration logos"]] (duration: 09m 19s)
* 08:11 hashar@deploy2002: hashar, stang: Continuing with sync
* 08:08 hashar@deploy2002: hashar, stang: Backport for [[gerrit:1248314{{!}}Revert "zhwiki: Add 2026 CNY celebration logos"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:06 hashar@deploy2002: Started scap sync-world: Backport for [[gerrit:1248314{{!}}Revert "zhwiki: Add 2026 CNY celebration logos"]]
* 08:02 moritzm: uploaded openjdk-8 8u482-ga-1~deb11u1 to component/jdk8 of bullseye-wikimedia
* 07:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts bast4005.wikimedia.org
* 07:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:56 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast4005.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast4005.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:48 moritzm: uploaded bird2 2.18-1~wmf13u2 to the main component of trixie-wikimedia [[phab:T413740|T413740]]
* 07:47 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:47 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 07:42 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts bast4005.wikimedia.org
* 06:35 marostegui@cumin1003: dbctl commit (dc=all): 'Remove es1033 [[phab:T408772|T408772]]', diff saved to https://phabricator.wikimedia.org/P89804 and previous config saved to /var/cache/conftool/dbconfig/20260305-063548-marostegui.json
* 02:10 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 55s)
* 02:02 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 02:01 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248163{{!}}Stop writing to il_to on medium size wikis (T415787)]] (duration: 06m 14s)
* 01:58 zabe@deploy2002: zabe: Continuing with sync
* 01:57 zabe@deploy2002: zabe: Backport for [[gerrit:1248163{{!}}Stop writing to il_to on medium size wikis (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:55 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248163{{!}}Stop writing to il_to on medium size wikis (T415787)]]
* 01:40 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1246099{{!}}Start reading from new file tables on medium wikis (T416548)]] (duration: 06m 15s)
* 01:36 zabe@deploy2002: zabe: Continuing with sync
* 01:36 zabe@deploy2002: zabe: Backport for [[gerrit:1246099{{!}}Start reading from new file tables on medium wikis (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:34 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1246099{{!}}Start reading from new file tables on medium wikis (T416548)]]
* 01:29 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248154{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]], [[gerrit:1248153{{!}}Revert^2 "ImageListPager: Properly support file schema migration read new"]] (duration: 07m 21s)
* 01:25 zabe@deploy2002: zabe: Continuing with sync
* 01:23 zabe@deploy2002: zabe: Backport for [[gerrit:1248154{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]], [[gerrit:1248153{{!}}Revert^2 "ImageListPager: Properly support file schema migration read new"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:21 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248154{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]], [[gerrit:1248153{{!}}Revert^2 "ImageListPager: Properly support file schema migration read new"]]
* 00:55 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248021{{!}}Stop writing to il_to on small wikis (T415787)]] (duration: 06m 49s)
* 00:51 zabe@deploy2002: zabe: Continuing with sync
* 00:50 zabe@deploy2002: zabe: Backport for [[gerrit:1248021{{!}}Stop writing to il_to on small wikis (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:48 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248021{{!}}Stop writing to il_to on small wikis (T415787)]]
* 00:19 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248125{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]], [[gerrit:1248123{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]] (duration: 08m 52s)
* 00:13 zabe@deploy2002: zabe: Continuing with sync
* 00:12 zabe@deploy2002: zabe: Backport for [[gerrit:1248125{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]], [[gerrit:1248123{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:10 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248125{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]], [[gerrit:1248123{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]]
== 2026-03-04 ==
* 22:57 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 22:56 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 22:55 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 22:55 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 22:55 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 22:54 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 22:35 tgr_: UTC late deploys done
* 22:33 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248084{{!}}Introduce a Semantic Search query route and builder (T413969)]], [[gerrit:1248085{{!}}Wire up semantic query building (T413969)]] (duration: 38m 28s)
* 22:16 tgr@deploy2002: tgr, ebernhardson: Continuing with sync
* 22:14 tgr@deploy2002: tgr, ebernhardson: Backport for [[gerrit:1248084{{!}}Introduce a Semantic Search query route and builder (T413969)]], [[gerrit:1248085{{!}}Wire up semantic query building (T413969)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248084{{!}}Introduce a Semantic Search query route and builder (T413969)]], [[gerrit:1248085{{!}}Wire up semantic query building (T413969)]]
* 21:48 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248012{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #3) (T415007 T418999)]] (duration: 07m 05s)
* 21:47 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on dse-k8s-worker1028.eqiad.wmnet with reason: broken networking
* 21:44 tgr@deploy2002: tgr: Continuing with sync
* 21:43 tgr@deploy2002: tgr: Backport for [[gerrit:1248012{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #3) (T415007 T418999)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:40 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248012{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #3) (T415007 T418999)]]
* 21:36 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248081{{!}}Add synthetic AAA experiment (T418614)]], [[gerrit:1248080{{!}}Add synthetic AAA experiment (T418614)]] (duration: 09m 11s)
* 21:35 bking@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1028.eqiad.wmnet
* 21:32 tgr@deploy2002: cjming, tgr: Continuing with sync
* 21:30 bking@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1028.eqiad.wmnet
* 21:29 tgr@deploy2002: cjming, tgr: Backport for [[gerrit:1248081{{!}}Add synthetic AAA experiment (T418614)]], [[gerrit:1248080{{!}}Add synthetic AAA experiment (T418614)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:27 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248081{{!}}Add synthetic AAA experiment (T418614)]], [[gerrit:1248080{{!}}Add synthetic AAA experiment (T418614)]]
* 21:21 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1245473{{!}}logging: set poolcounter channel log level to info (T418612)]] (duration: 09m 04s)
* 21:17 tgr@deploy2002: tgr, cwhite: Continuing with sync
* 21:14 tgr@deploy2002: tgr, cwhite: Backport for [[gerrit:1245473{{!}}logging: set poolcounter channel log level to info (T418612)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:12 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1245473{{!}}logging: set poolcounter channel log level to info (T418612)]]
* 21:07 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248007{{!}}Fix $wgJwtSessionCookieIssuer (T415007 T418999)]] (duration: 09m 55s)
* 21:03 tgr@deploy2002: tgr: Continuing with sync
* 20:59 tgr@deploy2002: tgr: Backport for [[gerrit:1248007{{!}}Fix $wgJwtSessionCookieIssuer (T415007 T418999)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:57 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248007{{!}}Fix $wgJwtSessionCookieIssuer (T415007 T418999)]]
* 19:56 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.18 refs [[phab:T413809|T413809]]
* 19:44 jhuneidi@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248011{{!}}CategoryViewer: Fall back to empty string in case of missing nextpage (T418934)]] (duration: 10m 47s)
* 19:44 cdobbins@cumin2002: conftool action : set/pooled=yes:weight=1; selector: name=cp205[0-8].codfw.wmnet
* 19:43 cdobbins@cumin2002: conftool action : set/pooled=yes:weight=1; selector: name=cp2049.codfw.wmnet
* 19:40 jhuneidi@deploy2002: zabe, jhuneidi: Continuing with sync
* 19:35 jhuneidi@deploy2002: zabe, jhuneidi: Backport for [[gerrit:1248011{{!}}CategoryViewer: Fall back to empty string in case of missing nextpage (T418934)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:34 brett@puppetserver1001: conftool action : set/weight=1; selector: name=cp2043.*
* 19:34 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2043.*
* 19:33 jhuneidi@deploy2002: Started scap sync-world: Backport for [[gerrit:1248011{{!}}CategoryViewer: Fall back to empty string in case of missing nextpage (T418934)]]
* 19:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2043.codfw.wmnet with OS trixie
* 19:23 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 19:22 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 19:22 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 19:22 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 19:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2043.codfw.wmnet with reason: host reimage
* 19:06 brett@puppetserver1001: conftool action : set/weight=1; selector: name=cp204[45678].*
* 19:04 brett@puppetserver1001: conftool action : set/weight=100; selector: name=cp204[45678].*
* 19:02 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2043.codfw.wmnet with reason: host reimage
* 18:58 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp204[45678].*
* 18:52 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 18:51 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 18:50 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 18:50 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 18:49 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:49 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:49 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 18:48 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 18:48 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 18:47 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 18:47 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 18:47 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp2043.codfw.wmnet with OS trixie
* 18:46 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 18:43 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 18:42 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 18:41 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 18:41 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 18:40 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:40 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:40 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 18:40 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 18:39 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 18:39 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 18:38 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 18:37 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 18:37 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:37 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:32 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:32 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:16 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 18:16 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 18:16 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 18:15 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 18:15 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:14 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:14 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 18:13 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 18:13 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 18:13 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 18:12 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 18:12 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2047.codfw.wmnet with OS trixie
* 17:33 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
* 17:27 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
* 17:23 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:23 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:18 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:18 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:15 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:13 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:09 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp2047.codfw.wmnet with OS trixie
* 16:55 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:55 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:54 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:54 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1007.eqiad.wmnet with OS bookworm
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-unlock-scap (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:39 root@deploy2002: Unlocked for deployment [ALL REPOSITORIES]: Datacenter switchover from eqiad to codfw - [[phab:T418133|T418133]] (duration: 25m 37s)
* 16:39 root@deploy2002: Forcefully removing global lock: Datacenter switchover from eqiad to codfw - [[phab:T418133|T418133]]
* 16:39 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-unlock-scap for datacenter switchover from eqiad to codfw
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
* 16:27 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
* 16:27 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters for datacenter switchover from eqiad to codfw
* 16:27 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-restore-ttl (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:26 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-restore-ttl for datacenter switchover from eqiad to codfw
* 16:26 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:26 root@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 16:26 root@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 16:26 root@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 16:26 root@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 16:26 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.08-start-maintenance for datacenter switchover from eqiad to codfw
* 16:25 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:25 root@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: sync
* 16:25 root@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: sync
* 16:25 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: [DRY-RUN] MediaWiki read-only period ends at: 2026-03-04 16:24:40.502004
* 16:24 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.07-set-readwrite for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1007.eqiad.wmnet with reason: host reimage
* 16:24 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki for datacenter switchover from eqiad to codfw
* 16:23 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:23 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.03-set-db-readonly for datacenter switchover from eqiad to codfw
* 16:23 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.02-set-readonly (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:22 blake@cumin1003: [DRY-RUN] MediaWiki read-only period starts at: 2026-03-04 16:22:41.755892
* 16:22 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.02-set-readonly for datacenter switchover from eqiad to codfw
* 16:20 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1007.eqiad.wmnet with reason: host reimage
* 16:20 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:20 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance for datacenter switchover from eqiad to codfw
* 16:19 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:14 moritzm: upgrading cloudservices* to Bird 2.18 [[phab:T413740|T413740]]
* 16:14 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl for datacenter switchover from eqiad to codfw
* 16:13 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-lock-scap (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:13 root@deploy2002: Locking from deployment [ALL REPOSITORIES]: Datacenter switchover from eqiad to codfw - [[phab:T418133|T418133]]
* 16:13 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-lock-scap for datacenter switchover from eqiad to codfw
* 16:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
* 16:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
* 16:10 moritzm: remove ganeti4005 from ganeti/ulsfo cluster [[phab:T418993|T418993]]
* 16:10 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1007.eqiad.wmnet with OS bookworm
* 16:06 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:06 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks for datacenter switchover from eqiad to codfw
* 15:59 XioNoX: push pfw policies - [[phab:T418402|T418402]]
* 15:37 sukhe@dns1004: END - running authdns-update
* 15:36 sukhe@dns1004: START - running authdns-update
* 15:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1219.eqiad.wmnet
* 15:32 aqu@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 15:31 aqu@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
* 15:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4005.ulsfo.wmnet
* 15:29 cgoubert@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on P<nowiki>{</nowiki>ms-fe10[14-24].*<nowiki>}</nowiki> and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 15:24 cgoubert@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on P<nowiki>{</nowiki>ms-fe10[14-24].*<nowiki>}</nowiki> and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 15:22 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 15:22 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 15:22 cgoubert@cumin1003: END (ERROR) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=97) rolling restart_daemons on A:swift-fe-eqiad
* 15:21 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1219.eqiad.wmnet
* 15:21 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1218.eqiad.wmnet
* 15:19 cgoubert@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-eqiad
* 15:16 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: name=cirrussearch1120.eqiad.wmnet
* 15:16 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: name=cirrussearch1121.eqiad.wmnet
* 15:16 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1115.eqiad.wmnet [reason: [[phab:T418772|T418772]] - BGP maintenance]
* 15:16 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:16 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: name=cirrussearch1122.eqiad.wmnet
* 15:15 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:15 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:14 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:13 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:13 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:10 XioNoX: lsw1-d7-eqiad# tools network-instance default protocols bgp neighbor 10.64.128.17 reset-peer - [[phab:T418772|T418772]]
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
* 15:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1218.eqiad.wmnet
* 15:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1217.eqiad.wmnet
* 15:09 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:05 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:05 moritzm: upgrading cloudlb* to Bird 2.18 [[phab:T413740|T413740]]
* 15:05 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:04 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:58 Dreamy_Jazz: Afternoon UTC backport window done
* 14:58 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244373{{!}}zhwiki: Remove all rights from accountcreator (T418089)]] (duration: 08m 12s)
* 14:57 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:57 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1217.eqiad.wmnet
* 14:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1216.eqiad.wmnet
* 14:57 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:56 btullis@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6 days, 0:00:00 on dse-k8s-worker[1010-1011,1013,1018-1019].eqiad.wmnet with reason: Adding 10 Gbps NIC
* 14:54 dreamyjazz@deploy2002: dreamyjazz, 1f616emo: Continuing with sync
* 14:52 jmm@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
* 14:52 dreamyjazz@deploy2002: dreamyjazz, 1f616emo: Backport for [[gerrit:1244373{{!}}zhwiki: Remove all rights from accountcreator (T418089)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:50 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1244373{{!}}zhwiki: Remove all rights from accountcreator (T418089)]]
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1216.eqiad.wmnet
* 14:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1215.eqiad.wmnet
* 14:44 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248009{{!}}Hooks: Fix liquidthreads log type definition bugs (T417425 T419006)]], [[gerrit:1248008{{!}}Define $wgWikimediaMessagesHasLiquidThreadsLogs (T417425)]] (duration: 07m 11s)
* 14:44 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1115.eqiad.wmnet [reason: [[phab:T418772|T418772]] - BGP maintenance]
* 14:44 taavi: updating CR firewall policy with https://gerrit.wikimedia.org/r/c/operations/homer/public/+/970275
* 14:42 ayounsi@cumin1003: conftool action : set/pooled=no; selector: name=cirrussearch1122.eqiad.wmnet
* 14:42 ayounsi@cumin1003: conftool action : set/pooled=no; selector: name=cirrussearch1121.eqiad.wmnet
* 14:42 ayounsi@cumin1003: conftool action : set/pooled=no; selector: name=cirrussearch1120.eqiad.wmnet
* 14:40 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 14:39 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1248009{{!}}Hooks: Fix liquidthreads log type definition bugs (T417425 T419006)]], [[gerrit:1248008{{!}}Define $wgWikimediaMessagesHasLiquidThreadsLogs (T417425)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:37 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1248009{{!}}Hooks: Fix liquidthreads log type definition bugs (T417425 T419006)]], [[gerrit:1248008{{!}}Define $wgWikimediaMessagesHasLiquidThreadsLogs (T417425)]]
* 14:33 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1215.eqiad.wmnet
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1214.eqiad.wmnet
* 14:32 btullis@puppetserver1001: conftool action : get/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1028.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1028.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1025.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1024.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/weight=1; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1028.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/weight=1; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1025.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/weight=1; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1024.eqiad.wmnet
* 14:30 btullis@puppetserver1001: conftool action : get/pooled; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1024.eqiad.wmnet
* 14:29 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_esams and A:cp - 3.0 upgrade ()
* 14:27 arnaudb@dns1004: END - running authdns-update
* 14:26 arnaudb@dns1004: START - running authdns-update
* 14:26 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248000{{!}}Revert "Enable JWT session cookie for bot passwords (all wikis) (attempt #2)" (T415007 T418999)]] (duration: 07m 19s)
* 14:22 tgr@deploy2002: tgr: Continuing with sync
* 14:21 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1214.eqiad.wmnet
* 14:21 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1213.eqiad.wmnet
* 14:21 tgr@deploy2002: tgr: Backport for [[gerrit:1248000{{!}}Revert "Enable JWT session cookie for bot passwords (all wikis) (attempt #2)" (T415007 T418999)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:19 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248000{{!}}Revert "Enable JWT session cookie for bot passwords (all wikis) (attempt #2)" (T415007 T418999)]]
* 14:14 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247566{{!}}Enable new HTML confirmation emails for all (T416748)]] (duration: 07m 46s)
* 14:13 elukey@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 14:13 elukey@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 14:10 sgimeno@deploy2002: migr, sgimeno: Continuing with sync
* 14:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1213.eqiad.wmnet
* 14:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1212.eqiad.wmnet
* 14:09 sgimeno@deploy2002: migr, sgimeno: Backport for [[gerrit:1247566{{!}}Enable new HTML confirmation emails for all (T416748)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 elukey@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 14:08 elukey@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 14:08 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:07 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:07 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1247566{{!}}Enable new HTML confirmation emails for all (T416748)]]
* 13:59 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4005.ulsfo.wmnet
* 13:58 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4005.ulsfo.wmnet
* 13:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4005.ulsfo.wmnet
* 13:57 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1212.eqiad.wmnet
* 13:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1211.eqiad.wmnet
* 13:49 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_esams and A:cp - 3.0 upgrade ()
* 13:45 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1211.eqiad.wmnet
* 13:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1210.eqiad.wmnet
* 13:43 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_esams and A:cp - 3.0 upgrade ()
* 13:40 arnaudb@dns1004: END - running authdns-update
* 13:39 arnaudb@dns1004: START - running authdns-update
* 13:37 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
* 13:33 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1210.eqiad.wmnet
* 13:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1209.eqiad.wmnet
* 13:20 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1209.eqiad.wmnet
* 13:20 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1208.eqiad.wmnet
* 13:17 aokoth@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:17 aokoth@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:16 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:15 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1208.eqiad.wmnet
* 13:06 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1207.eqiad.wmnet
* 13:03 arnaudb@dns1005: END - running authdns-update
* 13:02 arnaudb@dns1005: START - running authdns-update
* 13:00 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_esams and A:cp - 3.0 upgrade ()
* 13:00 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
* 12:46 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 12:45 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 12:44 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 12:44 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
* 12:43 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 12:43 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
* 12:33 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 12:29 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
* 12:10 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
* 12:08 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp - 3.0 upgrade ()
* 12:03 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1207.eqiad.wmnet
* 12:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1206.eqiad.wmnet
* 11:53 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1206.eqiad.wmnet
* 11:53 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1205.eqiad.wmnet
* 11:36 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f8-eqiad
* 11:36 jmm@cumin2002: START - Cookbook sre.network.tls for network device lsw1-f8-eqiad
* 11:34 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp - 3.0 upgrade ()
* 11:34 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
* 11:28 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247968{{!}}SI: Update instrumentation schema (T418293)]] (duration: 16m 22s)
* 11:22 fabfur: start upgrading haproxy to 3.0 on A:cp-eqiad ([[phab:T417253|T417253]])
* 11:22 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 11:17 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1247968{{!}}SI: Update instrumentation schema (T418293)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:13 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp - 3.0 upgrade ()
* 11:12 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1247968{{!}}SI: Update instrumentation schema (T418293)]]
* 11:08 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp - 3.0 upgrade ()
* 11:07 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=1) rolling reimage on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 11:07 blake@cumin1003: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 11:06 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2356].codfw.wmnet
* 11:06 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2356].codfw.wmnet
* 11:03 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=1) rolling reimage on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 11:03 blake@cumin1003: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 10:55 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2356].codfw.wmnet
* 10:55 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2356].codfw.wmnet
* 10:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1205.eqiad.wmnet
* 10:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1204.eqiad.wmnet
* 10:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 10:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 10:42 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1204.eqiad.wmnet
* 10:42 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1203.eqiad.wmnet
* 10:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 10:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 10:28 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1203.eqiad.wmnet
* 10:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1202.eqiad.wmnet
* 10:25 fabfur: start upgrading haproxy to 3.0 on A:cp-drmrs ([[phab:T417253|T417253]])
* 10:25 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp - 3.0 upgrade ()
* 10:25 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp - 3.0 upgrade ()
* 10:24 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247941{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]], [[gerrit:1247944{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]] (duration: 06m 42s)
* 10:22 arnaudb@dns1004: END - running authdns-update
* 10:20 arnaudb@dns1004: START - running authdns-update
* 10:20 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 10:20 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1247941{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]], [[gerrit:1247944{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:18 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1247941{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]], [[gerrit:1247944{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]]
* 10:16 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1202.eqiad.wmnet
* 10:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1201.eqiad.wmnet
* 10:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 10:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 10:06 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 10:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 10:04 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1201.eqiad.wmnet
* 10:04 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1200.eqiad.wmnet
* 09:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 09:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1200.eqiad.wmnet
* 09:39 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247925{{!}}Require 2FA from CentralNotice admins and WMF Trust & Safety (T418580 T417880)]] (duration: 08m 23s)
* 09:36 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_codfw and A:cp - 3.0 upgrade ()
* 09:35 mszwarc@deploy2002: mszwarc: Continuing with sync
* 09:33 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1247925{{!}}Require 2FA from CentralNotice admins and WMF Trust & Safety (T418580 T417880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pki-root1002.eqiad.wmnet
* 09:31 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp - 3.0 upgrade ()
* 09:31 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1247925{{!}}Require 2FA from CentralNotice admins and WMF Trust & Safety (T418580 T417880)]]
* 09:30 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host pki-root1002.eqiad.wmnet
* 09:20 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:20 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:20 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:20 jmm@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:03 gehel: switching off Blazegraph on wdqs2009 (legacy full graph endpoint is end of life) - [[phab:T411410|T411410]] / [[phab:T415073|T415073]]
* 09:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 09:02 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1096.eqiad.wmnet
* 09:00 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet
* 08:59 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 08:58 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 08:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 08:56 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1096.eqiad.wmnet
* 08:54 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet
* 08:52 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 08:49 topranks: disabling IBGP session between ssw1-d1-eqiad and ssw1-d8-eqiad to remove backup paths (try #2) [[phab:T411054|T411054]]
* 08:36 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on backup1007.eqiad.wmnet,dbprov1004.eqiad.wmnet with reason: network maintenance
* 08:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 08:31 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 08:21 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp - 3.0 upgrade ()
* 08:21 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_codfw and A:cp - 3.0 upgrade ()
* 08:11 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp5032.*
* 07:54 topranks: disabling IBGP session between ssw1-d1-eqiad and ssw1-d8-eqiad to remove backup paths [[phab:T411054|T411054]]
* 07:43 moritzm: installing libbpf updates from Bookworm point release
* 05:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 05:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 04s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 01:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89793 and previous config saved to /var/cache/conftool/dbconfig/20260304-015657-marostegui.json
* 01:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P89792 and previous config saved to /var/cache/conftool/dbconfig/20260304-014150-marostegui.json
* 01:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P89791 and previous config saved to /var/cache/conftool/dbconfig/20260304-012642-marostegui.json
* 01:23 zabe@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 01:22 zabe@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 01:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89790 and previous config saved to /var/cache/conftool/dbconfig/20260304-011134-marostegui.json
* 00:46 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1263 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89789 and previous config saved to /var/cache/conftool/dbconfig/20260304-004638-marostegui.json
* 00:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1263.eqiad.wmnet with reason: Maintenance
* 00:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89788 and previous config saved to /var/cache/conftool/dbconfig/20260304-004615-marostegui.json
* 00:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P89787 and previous config saved to /var/cache/conftool/dbconfig/20260304-003107-marostegui.json
* 00:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P89786 and previous config saved to /var/cache/conftool/dbconfig/20260304-001559-marostegui.json
* 00:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89785 and previous config saved to /var/cache/conftool/dbconfig/20260304-000052-marostegui.json
== 2026-03-03 ==
* 23:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1262 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89784 and previous config saved to /var/cache/conftool/dbconfig/20260303-233500-marostegui.json
* 23:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1262.eqiad.wmnet with reason: Maintenance
* 23:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89783 and previous config saved to /var/cache/conftool/dbconfig/20260303-233436-marostegui.json
* 23:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P89782 and previous config saved to /var/cache/conftool/dbconfig/20260303-231929-marostegui.json
* 23:10 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 23:08 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 23:08 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:07 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:05 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 23:05 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 23:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P89781 and previous config saved to /var/cache/conftool/dbconfig/20260303-230421-marostegui.json
* 23:04 bking@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1028.eqiad.wmnet
* 23:02 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247689{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247690{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247596{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #2) (T415007)]] (duration: 21m 47s)
* 23:00 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7008.magru.wmnet [reason: lldpd packet drop issues]
* 22:58 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7008 [reason: lldpd packet drop issues]
* 22:58 tgr@deploy2002: tgr: Continuing with sync
* 22:56 bking@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1028.eqiad.wmnet
* 22:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89780 and previous config saved to /var/cache/conftool/dbconfig/20260303-224913-marostegui.json
* 22:45 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 22:45 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 22:44 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 22:44 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 22:42 tgr@deploy2002: tgr: Backport for [[gerrit:1247689{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247690{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247596{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #2) (T415007)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:40 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1247689{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247690{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247596{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #2) (T415007)]]
* 22:26 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 22:26 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 22:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1261 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89779 and previous config saved to /var/cache/conftool/dbconfig/20260303-222324-marostegui.json
* 22:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1261.eqiad.wmnet with reason: Maintenance
* 22:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89778 and previous config saved to /var/cache/conftool/dbconfig/20260303-222301-marostegui.json
* 22:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P89777 and previous config saved to /var/cache/conftool/dbconfig/20260303-220754-marostegui.json
* 21:59 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1245162 [[phab:T411807|T411807]] (duration: 12m 15s)
* 21:58 rzl@deploy2002: rzl: Continuing with sync
* 21:56 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 21:56 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 21:55 rzl@deploy2002: rzl: https://gerrit.wikimedia.org/r/1245162 [[phab:T411807|T411807]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1245162 [[phab:T411807|T411807]]
* 21:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P89776 and previous config saved to /var/cache/conftool/dbconfig/20260303-215247-marostegui.json
* 21:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89775 and previous config saved to /var/cache/conftool/dbconfig/20260303-214931-marostegui.json
* 21:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp2045.codfw.wmnet
* 21:48 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp2045.codfw.wmnet
* 21:40 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 21:39 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 21:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89774 and previous config saved to /var/cache/conftool/dbconfig/20260303-213739-marostegui.json
* 21:35 jhuneidi@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244748{{!}}REST: show the beta Attribution API in the REST Sandbox (T418522)]] (duration: 07m 41s)
* 21:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P89773 and previous config saved to /var/cache/conftool/dbconfig/20260303-213423-marostegui.json
* 21:32 jhuneidi@deploy2002: jhuneidi, bpirkle: Continuing with sync
* 21:30 jhuneidi@deploy2002: jhuneidi, bpirkle: Backport for [[gerrit:1244748{{!}}REST: show the beta Attribution API in the REST Sandbox (T418522)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:28 jhuneidi@deploy2002: Started scap sync-world: Backport for [[gerrit:1244748{{!}}REST: show the beta Attribution API in the REST Sandbox (T418522)]]
* 21:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P89772 and previous config saved to /var/cache/conftool/dbconfig/20260303-211915-marostegui.json
* 21:18 jhuneidi@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247652{{!}}Remove redundant mw-extra wgRestSandboxSpecs entry]] (duration: 06m 56s)
* 21:14 jhuneidi@deploy2002: jhuneidi, aaron: Continuing with sync
* 21:13 jhuneidi@deploy2002: jhuneidi, aaron: Backport for [[gerrit:1247652{{!}}Remove redundant mw-extra wgRestSandboxSpecs entry]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:11 jhuneidi@deploy2002: Started scap sync-world: Backport for [[gerrit:1247652{{!}}Remove redundant mw-extra wgRestSandboxSpecs entry]]
* 21:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1260 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89771 and previous config saved to /var/cache/conftool/dbconfig/20260303-211033-marostegui.json
* 21:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1260.eqiad.wmnet with reason: Maintenance
* 21:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89770 and previous config saved to /var/cache/conftool/dbconfig/20260303-211009-marostegui.json
* 21:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89769 and previous config saved to /var/cache/conftool/dbconfig/20260303-210407-marostegui.json
* 20:58 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp2045.codfw.wmnet with reason: troubleshooting for [[phab:T418527|T418527]]
* 20:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P89768 and previous config saved to /var/cache/conftool/dbconfig/20260303-205502-marostegui.json
* 20:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7008.magru.wmnet with OS trixie
* 20:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89767 and previous config saved to /var/cache/conftool/dbconfig/20260303-204452-marostegui.json
* 20:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2248.codfw.wmnet with reason: Maintenance
* 20:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89766 and previous config saved to /var/cache/conftool/dbconfig/20260303-204439-marostegui.json
* 20:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P89765 and previous config saved to /var/cache/conftool/dbconfig/20260303-203954-marostegui.json
* 20:34 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:34 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P89764 and previous config saved to /var/cache/conftool/dbconfig/20260303-202931-marostegui.json
* 20:24 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7008.magru.wmnet with reason: host reimage
* 20:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89763 and previous config saved to /var/cache/conftool/dbconfig/20260303-202447-marostegui.json
* 20:17 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7008.magru.wmnet with reason: host reimage
* 20:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P89762 and previous config saved to /var/cache/conftool/dbconfig/20260303-201423-marostegui.json
* 20:10 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1199.eqiad.wmnet
* 19:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89761 and previous config saved to /var/cache/conftool/dbconfig/20260303-195916-marostegui.json
* 19:59 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1252 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89760 and previous config saved to /var/cache/conftool/dbconfig/20260303-195900-marostegui.json
* 19:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1252.eqiad.wmnet with reason: Maintenance
* 19:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89759 and previous config saved to /var/cache/conftool/dbconfig/20260303-195835-marostegui.json
* 19:51 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7008.magru.wmnet with OS trixie
* 19:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P89758 and previous config saved to /var/cache/conftool/dbconfig/20260303-194327-marostegui.json
* 19:42 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp2043.codfw.wmnet
* 19:42 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp2043.codfw.wmnet
* 19:33 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89757 and previous config saved to /var/cache/conftool/dbconfig/20260303-193351-marostegui.json
* 19:33 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 19:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89756 and previous config saved to /var/cache/conftool/dbconfig/20260303-193338-marostegui.json
* 19:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P89755 and previous config saved to /var/cache/conftool/dbconfig/20260303-192820-marostegui.json
* 19:19 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.18 refs [[phab:T413809|T413809]]
* 19:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P89754 and previous config saved to /var/cache/conftool/dbconfig/20260303-191830-marostegui.json
* 19:17 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2047.codfw.wmnet with OS trixie
* 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89753 and previous config saved to /var/cache/conftool/dbconfig/20260303-191312-marostegui.json
* 19:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P89752 and previous config saved to /var/cache/conftool/dbconfig/20260303-190323-marostegui.json
* 18:53 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
* 18:49 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1199.eqiad.wmnet
* 18:49 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1198.eqiad.wmnet
* 18:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1249 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89751 and previous config saved to /var/cache/conftool/dbconfig/20260303-184937-marostegui.json
* 18:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1249.eqiad.wmnet with reason: Maintenance
* 18:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89750 and previous config saved to /var/cache/conftool/dbconfig/20260303-184913-marostegui.json
* 18:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89749 and previous config saved to /var/cache/conftool/dbconfig/20260303-184815-marostegui.json
* 18:47 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
* 18:45 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1096.eqiad.wmnet with OS bullseye
* 18:36 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1198.eqiad.wmnet
* 18:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1197.eqiad.wmnet
* 18:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P89747 and previous config saved to /var/cache/conftool/dbconfig/20260303-183406-marostegui.json
* 18:29 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp2047.codfw.wmnet with OS trixie
* 18:24 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1197.eqiad.wmnet
* 18:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1196.eqiad.wmnet
* 18:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89746 and previous config saved to /var/cache/conftool/dbconfig/20260303-182346-marostegui.json
* 18:23 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1096.eqiad.wmnet with reason: host reimage
* 18:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 18:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89745 and previous config saved to /var/cache/conftool/dbconfig/20260303-182321-marostegui.json
* 18:19 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1096.eqiad.wmnet with reason: host reimage
* 18:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P89744 and previous config saved to /var/cache/conftool/dbconfig/20260303-181859-marostegui.json
* 18:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1196.eqiad.wmnet
* 18:11 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1195.eqiad.wmnet
* 18:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P89743 and previous config saved to /var/cache/conftool/dbconfig/20260303-180814-marostegui.json
* 18:04 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247635{{!}}Style fixes for copy-paste feature (T414072)]] (duration: 32m 54s)
* 18:04 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89742 and previous config saved to /var/cache/conftool/dbconfig/20260303-180352-marostegui.json
* 18:02 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1096.eqiad.wmnet with OS bullseye
* 18:02 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:59 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1195.eqiad.wmnet
* 17:59 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host an-worker1194.eqiad.wmnet
* 17:55 ariel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:53 ariel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P89741 and previous config saved to /var/cache/conftool/dbconfig/20260303-175304-marostegui.json
* 17:52 jforrester@deploy2002: jforrester: Continuing with sync
* 17:51 jforrester@deploy2002: jforrester: Backport for [[gerrit:1247635{{!}}Style fixes for copy-paste feature (T414072)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:47 ariel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:46 ariel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1194.eqiad.wmnet
* 17:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1193.eqiad.wmnet
* 17:39 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89740 and previous config saved to /var/cache/conftool/dbconfig/20260303-173914-marostegui.json
* 17:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1248.eqiad.wmnet with reason: Maintenance
* 17:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89739 and previous config saved to /var/cache/conftool/dbconfig/20260303-173850-marostegui.json
* 17:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89738 and previous config saved to /var/cache/conftool/dbconfig/20260303-173756-marostegui.json
* 17:31 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1247635{{!}}Style fixes for copy-paste feature (T414072)]]
* 17:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1193.eqiad.wmnet
* 17:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1192.eqiad.wmnet
* 17:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P89736 and previous config saved to /var/cache/conftool/dbconfig/20260303-172343-marostegui.json
* 17:18 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1192.eqiad.wmnet
* 17:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1191.eqiad.wmnet
* 17:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89735 and previous config saved to /var/cache/conftool/dbconfig/20260303-171149-marostegui.json
* 17:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 17:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89734 and previous config saved to /var/cache/conftool/dbconfig/20260303-171126-marostegui.json
* 17:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P89733 and previous config saved to /var/cache/conftool/dbconfig/20260303-170835-marostegui.json
* 17:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1191.eqiad.wmnet
* 17:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1190.eqiad.wmnet
* 16:56 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1190.eqiad.wmnet
* 16:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P89732 and previous config saved to /var/cache/conftool/dbconfig/20260303-165618-marostegui.json
* 16:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89731 and previous config saved to /var/cache/conftool/dbconfig/20260303-165327-marostegui.json
* 16:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1189.eqiad.wmnet
* 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P89730 and previous config saved to /var/cache/conftool/dbconfig/20260303-164111-marostegui.json
* 16:34 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main'.
* 16:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1189.eqiad.wmnet
* 16:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1188.eqiad.wmnet
* 16:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89729 and previous config saved to /var/cache/conftool/dbconfig/20260303-162845-marostegui.json
* 16:28 fceratto@cumin1003: dbctl commit (dc=all): 'Setting x1 codfw weights to 300 [[phab:T416705|T416705]]', diff saved to https://phabricator.wikimedia.org/P89728 and previous config saved to /var/cache/conftool/dbconfig/20260303-162836-fceratto.json
* 16:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1247.eqiad.wmnet with reason: Maintenance
* 16:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89727 and previous config saved to /var/cache/conftool/dbconfig/20260303-162603-marostegui.json
* 16:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 16:18 fceratto@cumin1003: dbctl commit (dc=all): 'Setting db1188 weight to 100 [[phab:T416705|T416705]]', diff saved to https://phabricator.wikimedia.org/P89726 and previous config saved to /var/cache/conftool/dbconfig/20260303-161846-fceratto.json
* 16:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 16:17 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1188.eqiad.wmnet
* 16:17 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1187.eqiad.wmnet
* 16:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1166: testing:crash
* 16:14 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1166: testing:crash
* 16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Setting db1169 weight to 300 [[phab:T416705|T416705]]', diff saved to https://phabricator.wikimedia.org/P89724 and previous config saved to /var/cache/conftool/dbconfig/20260303-161323-fceratto.json
* 16:12 fceratto@cumin1003: dbctl commit (dc=all): 'Setting db1188 weight to 300 [[phab:T416705|T416705]]', diff saved to https://phabricator.wikimedia.org/P89723 and previous config saved to /var/cache/conftool/dbconfig/20260303-161230-fceratto.json
* 16:07 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 16:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89722 and previous config saved to /var/cache/conftool/dbconfig/20260303-160720-marostegui.json
* 16:07 brennen@deploy2002: Finished deploy [phabricator/deployment@a883b6d]: deploy phab1004 for [[phab:T418872|T418872]] (duration: 01m 07s)
* 16:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1187.eqiad.wmnet
* 16:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1186.eqiad.wmnet
* 16:05 brennen@deploy2002: Started deploy [phabricator/deployment@a883b6d]: deploy phab1004 for [[phab:T418872|T418872]]
* 16:05 brennen@deploy2002: Finished deploy [phabricator/deployment@a883b6d]: deploy phab2002 for [[phab:T418872|T418872]] (duration: 00m 32s)
* 16:04 brennen@deploy2002: Started deploy [phabricator/deployment@a883b6d]: deploy phab2002 for [[phab:T418872|T418872]]
* 16:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2240 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89721 and previous config saved to /var/cache/conftool/dbconfig/20260303-160207-marostegui.json
* 16:02 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
* 16:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2240.codfw.wmnet with reason: Maintenance
* 16:01 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
* 16:00 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247569{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]] (duration: 09m 28s)
* 15:54 zabe@deploy2002: zabe: Continuing with sync
* 15:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1186.eqiad.wmnet
* 15:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1185.eqiad.wmnet
* 15:54 zabe@deploy2002: zabe: Backport for [[gerrit:1247569{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:53 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade ()
* 15:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P89720 and previous config saved to /var/cache/conftool/dbconfig/20260303-155212-marostegui.json
* 15:50 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1247569{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]]
* 15:49 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade ()
* 15:45 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:42 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1185.eqiad.wmnet
* 15:42 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1184.eqiad.wmnet
* 15:42 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:41 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 15:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 15:41 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 15:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89719 and previous config saved to /var/cache/conftool/dbconfig/20260303-154104-marostegui.json
* 15:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P89718 and previous config saved to /var/cache/conftool/dbconfig/20260303-153704-marostegui.json
* 15:36 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1352-1359].eqiad.wmnet
* 15:36 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1352-1359].eqiad.wmnet
* 15:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1184.eqiad.wmnet
* 15:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1183.eqiad.wmnet
* 15:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P89717 and previous config saved to /var/cache/conftool/dbconfig/20260303-152557-marostegui.json
* 15:23 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1178.eqiad.wmnet
* 15:22 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on P<nowiki>{</nowiki>cp5032.*<nowiki>}</nowiki> and A:cp - 3.0 upgrade ()
* 15:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89716 and previous config saved to /var/cache/conftool/dbconfig/20260303-152157-marostegui.json
* 15:19 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1183.eqiad.wmnet
* 15:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1182.eqiad.wmnet
* 15:16 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P<nowiki>{</nowiki>cp5032.*<nowiki>}</nowiki> and A:cp - 3.0 upgrade ()
* 15:15 fabfur@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=1) rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
* 15:14 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade ()
* 15:14 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
* 15:13 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade ()
* 15:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 15:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P89715 and previous config saved to /var/cache/conftool/dbconfig/20260303-151049-marostegui.json
* 15:08 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:07 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1182.eqiad.wmnet
* 15:07 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1181.eqiad.wmnet
* 14:57 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1244 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89714 and previous config saved to /var/cache/conftool/dbconfig/20260303-145727-marostegui.json
* 14:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1244.eqiad.wmnet with reason: Maintenance
* 14:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89713 and previous config saved to /var/cache/conftool/dbconfig/20260303-145704-marostegui.json
* 14:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89712 and previous config saved to /var/cache/conftool/dbconfig/20260303-145541-marostegui.json
* 14:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1181.eqiad.wmnet
* 14:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1180.eqiad.wmnet
* 14:49 moritzm: installing php7.4 security updates
* 14:46 jayme@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[1352-1359].eqiad.wmnet
* 14:46 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1352-1359].eqiad.wmnet
* 14:43 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1180.eqiad.wmnet
* 14:43 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1179.eqiad.wmnet
* 14:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P89711 and previous config saved to /var/cache/conftool/dbconfig/20260303-144156-marostegui.json
* 14:38 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 14:38 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1240716{{!}}Remove Editing-related config for special wikis (T400063)]] (duration: 06m 34s)
* 14:36 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:34 esanders@deploy2002: esanders: Continuing with sync
* 14:34 esanders@deploy2002: esanders: Backport for [[gerrit:1240716{{!}}Remove Editing-related config for special wikis (T400063)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:34 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:32 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1240716{{!}}Remove Editing-related config for special wikis (T400063)]]
* 14:31 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1179.eqiad.wmnet
* 14:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1178.eqiad.wmnet
* 14:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89710 and previous config saved to /var/cache/conftool/dbconfig/20260303-143141-marostegui.json
* 14:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 14:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89709 and previous config saved to /var/cache/conftool/dbconfig/20260303-143117-marostegui.json
* 14:29 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247578{{!}}PasteCheck: Enable by default (T405127)]] (duration: 08m 01s)
* 14:27 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
* 14:27 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet
* 14:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P89708 and previous config saved to /var/cache/conftool/dbconfig/20260303-142649-marostegui.json
* 14:26 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
* 14:25 esanders@deploy2002: esanders: Continuing with sync
* 14:23 esanders@deploy2002: esanders: Backport for [[gerrit:1247578{{!}}PasteCheck: Enable by default (T405127)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:21 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1247578{{!}}PasteCheck: Enable by default (T405127)]]
* 14:20 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet
* 14:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P89707 and previous config saved to /var/cache/conftool/dbconfig/20260303-141610-marostegui.json
* 14:15 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247576{{!}}Enable Wikibase GraphQL on test.wikidata.org (T417619)]], [[gerrit:1247577{{!}}Enable Wikibase GraphQL on production wikidata.org (T417619)]] (duration: 08m 17s)
* 14:11 esanders@deploy2002: esanders, jakob: Continuing with sync
* 14:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89706 and previous config saved to /var/cache/conftool/dbconfig/20260303-141142-marostegui.json
* 14:09 esanders@deploy2002: esanders, jakob: Backport for [[gerrit:1247576{{!}}Enable Wikibase GraphQL on test.wikidata.org (T417619)]], [[gerrit:1247577{{!}}Enable Wikibase GraphQL on production wikidata.org (T417619)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1247576{{!}}Enable Wikibase GraphQL on test.wikidata.org (T417619)]], [[gerrit:1247577{{!}}Enable Wikibase GraphQL on production wikidata.org (T417619)]]
* 14:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P89704 and previous config saved to /var/cache/conftool/dbconfig/20260303-140102-marostegui.json
* 13:47 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1243 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89703 and previous config saved to /var/cache/conftool/dbconfig/20260303-134702-marostegui.json
* 13:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1243.eqiad.wmnet with reason: Maintenance
* 13:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89702 and previous config saved to /var/cache/conftool/dbconfig/20260303-134639-marostegui.json
* 13:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89701 and previous config saved to /var/cache/conftool/dbconfig/20260303-134554-marostegui.json
* 13:31 moritzm: installing NSS security updates
* 13:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P89700 and previous config saved to /var/cache/conftool/dbconfig/20260303-133131-marostegui.json
* 13:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89699 and previous config saved to /var/cache/conftool/dbconfig/20260303-132414-marostegui.json
* 13:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 13:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89698 and previous config saved to /var/cache/conftool/dbconfig/20260303-132350-marostegui.json
* 13:20 tappof: Thanos: re-enable querier<->ruler cross-site traffic [[phab:T412924|T412924]]
* 13:17 dpogorzelski@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=recommendation-api,name=eqiad
* 13:17 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/ml-serve-eqiad: maintenance
* 13:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P89697 and previous config saved to /var/cache/conftool/dbconfig/20260303-131624-marostegui.json
* 13:16 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/ml-serve-eqiad: maintenance
* 13:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 13:11 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1177.eqiad.wmnet
* 13:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1359.eqiad.wmnet with OS trixie
* 13:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P89696 and previous config saved to /var/cache/conftool/dbconfig/20260303-130842-marostegui.json
* 13:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89695 and previous config saved to /var/cache/conftool/dbconfig/20260303-130117-marostegui.json
* 13:01 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:00 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1177.eqiad.wmnet
* 13:00 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 13:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1176.eqiad.wmnet
* 12:59 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1358.eqiad.wmnet with OS trixie
* 12:56 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:55 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:53 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1359.eqiad.wmnet with reason: host reimage
* 12:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P89694 and previous config saved to /var/cache/conftool/dbconfig/20260303-125335-marostegui.json
* 12:52 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1357.eqiad.wmnet with OS trixie
* 12:51 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:50 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:48 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1359.eqiad.wmnet with reason: host reimage
* 12:48 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1356.eqiad.wmnet with OS trixie
* 12:47 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:47 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 12:47 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:47 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 12:46 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 12:46 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 12:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1176.eqiad.wmnet
* 12:46 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1175.eqiad.wmnet
* 12:46 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 12:45 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:45 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 12:45 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 12:45 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:45 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'readability' for release 'main' .
* 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'ores-legacy' for release 'main' .
* 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 12:43 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1358.eqiad.wmnet with reason: host reimage
* 12:42 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 12:42 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 12:41 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:40 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247559{{!}}Enable thumb steps on private wikis too (T414805)]] (duration: 13m 01s)
* 12:39 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89693 and previous config saved to /var/cache/conftool/dbconfig/20260303-123827-marostegui.json
* 12:36 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1357.eqiad.wmnet with reason: host reimage
* 12:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1242 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89692 and previous config saved to /var/cache/conftool/dbconfig/20260303-123642-marostegui.json
* 12:36 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1359.eqiad.wmnet with OS trixie
* 12:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1242.eqiad.wmnet with reason: Maintenance
* 12:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89691 and previous config saved to /var/cache/conftool/dbconfig/20260303-123619-marostegui.json
* 12:34 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1175.eqiad.wmnet
* 12:34 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1174.eqiad.wmnet
* 12:34 dpogorzelski@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=recommendation-api,name=eqiad
* 12:33 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 12:33 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1356.eqiad.wmnet with reason: host reimage
* 12:31 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:31 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in eqiad/ml-serve-eqiad: maintenance
* 12:31 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:31 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1247559{{!}}Enable thumb steps on private wikis too (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:30 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in eqiad/ml-serve-eqiad: maintenance
* 12:28 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1358.eqiad.wmnet with reason: host reimage
* 12:28 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1357.eqiad.wmnet with reason: host reimage
* 12:28 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1356.eqiad.wmnet with reason: host reimage
* 12:27 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:27 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1247559{{!}}Enable thumb steps on private wikis too (T414805)]]
* 12:26 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1174.eqiad.wmnet
* 12:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1173.eqiad.wmnet
* 12:21 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P89690 and previous config saved to /var/cache/conftool/dbconfig/20260303-122112-marostegui.json
* 12:20 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:20 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:19 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1353.eqiad.wmnet with OS trixie
* 12:16 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1358.eqiad.wmnet with OS trixie
* 12:16 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1357.eqiad.wmnet with OS trixie
* 12:15 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1356.eqiad.wmnet with OS trixie
* 12:14 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1355.eqiad.wmnet with OS trixie
* 12:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89689 and previous config saved to /var/cache/conftool/dbconfig/20260303-121420-marostegui.json
* 12:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2219.codfw.wmnet with reason: Maintenance
* 12:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89688 and previous config saved to /var/cache/conftool/dbconfig/20260303-121355-marostegui.json
* 12:09 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1354.eqiad.wmnet with OS trixie
* 12:08 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1173.eqiad.wmnet
* 12:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1172.eqiad.wmnet
* 12:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P89687 and previous config saved to /var/cache/conftool/dbconfig/20260303-120604-marostegui.json
* 12:04 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1352.eqiad.wmnet with OS trixie
* 12:02 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1353.eqiad.wmnet with reason: host reimage
* 11:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P89686 and previous config saved to /var/cache/conftool/dbconfig/20260303-115847-marostegui.json
* 11:58 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1355.eqiad.wmnet with reason: host reimage
* 11:52 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1354.eqiad.wmnet with reason: host reimage
* 11:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89685 and previous config saved to /var/cache/conftool/dbconfig/20260303-115057-marostegui.json
* 11:48 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1352.eqiad.wmnet with reason: host reimage
* 11:44 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1355.eqiad.wmnet with reason: host reimage
* 11:43 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1354.eqiad.wmnet with reason: host reimage
* 11:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P89684 and previous config saved to /var/cache/conftool/dbconfig/20260303-114341-marostegui.json
* 11:43 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1353.eqiad.wmnet with reason: host reimage
* 11:42 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1352.eqiad.wmnet with reason: host reimage
* 11:40 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
* 11:36 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
* 11:31 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1355.eqiad.wmnet with OS trixie
* 11:31 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1354.eqiad.wmnet with OS trixie
* 11:30 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1353.eqiad.wmnet with OS trixie
* 11:30 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1352.eqiad.wmnet with OS trixie
* 11:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T418465|T418465]])', diff not saved (unable to send diff to phaste) and previous config saved to /var/cache/conftool/dbconfig/20260303-112828-marostegui.json
* 11:25 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1241 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89683 and previous config saved to /var/cache/conftool/dbconfig/20260303-112535-marostegui.json
* 11:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1241.eqiad.wmnet with reason: Maintenance
* 11:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89682 and previous config saved to /var/cache/conftool/dbconfig/20260303-112511-marostegui.json
* 11:21 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 11:18 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 11:18 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:17 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:17 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 11:16 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1350-1351].eqiad.wmnet
* 11:16 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1350-1351].eqiad.wmnet
* 11:15 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 11:15 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:15 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:15 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 11:14 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 11:14 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 11:13 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 11:13 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 11:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1172.eqiad.wmnet
* 11:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1171.eqiad.wmnet
* 11:13 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 11:13 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 11:12 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 11:11 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 11:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P89681 and previous config saved to /var/cache/conftool/dbconfig/20260303-111003-marostegui.json
* 11:09 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 11:08 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 11:08 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 11:07 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 11:07 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 11:06 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 11:05 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89680 and previous config saved to /var/cache/conftool/dbconfig/20260303-110551-marostegui.json
* 11:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2210.codfw.wmnet with reason: Maintenance
* 11:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89679 and previous config saved to /var/cache/conftool/dbconfig/20260303-110527-marostegui.json
* 10:59 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1171.eqiad.wmnet
* 10:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1170.eqiad.wmnet
* 10:57 slyngshede@dns1004: END - running authdns-update
* 10:55 slyngshede@dns1004: START - running authdns-update
* 10:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P89678 and previous config saved to /var/cache/conftool/dbconfig/20260303-105455-marostegui.json
* 10:54 hashar@deploy2002: Finished deploy [gerrit/gerrit@12177b1]: wm-checks-api: add tag for Selenium jobs (duration: 00m 13s)
* 10:54 hashar@deploy2002: Started deploy [gerrit/gerrit@12177b1]: wm-checks-api: add tag for Selenium jobs
* 10:51 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
* 10:51 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
* 10:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P89677 and previous config saved to /var/cache/conftool/dbconfig/20260303-105020-marostegui.json
* 10:47 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1170.eqiad.wmnet
* 10:45 fabfur: start upgrading haproxy to 3.0 on A:cp-eqsin ([[phab:T417253|T417253]])
* 10:43 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 10:41 moritzm: installing Django security updates
* 10:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89676 and previous config saved to /var/cache/conftool/dbconfig/20260303-103947-marostegui.json
* 10:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P89675 and previous config saved to /var/cache/conftool/dbconfig/20260303-103512-marostegui.json
* 10:34 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:33 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:33 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:31 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 10:25 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 10:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89674 and previous config saved to /var/cache/conftool/dbconfig/20260303-102004-marostegui.json
* 10:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1238 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89673 and previous config saved to /var/cache/conftool/dbconfig/20260303-101800-marostegui.json
* 10:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1238.eqiad.wmnet with reason: Maintenance
* 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89672 and previous config saved to /var/cache/conftool/dbconfig/20260303-101747-marostegui.json
* 09:57 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 09:56 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89670 and previous config saved to /var/cache/conftool/dbconfig/20260303-095655-marostegui.json
* 09:56 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2206.codfw.wmnet with reason: Maintenance
* 09:53 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:51 moritzm: installing qemu security updates
* 09:48 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 09:48 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 09:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P89669 and previous config saved to /var/cache/conftool/dbconfig/20260303-094732-marostegui.json
* 09:47 elukey@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 09:47 elukey@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 09:45 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 09:45 elukey@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 09:44 elukey@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 09:44 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'ores-legacy' for release 'main' .
* 09:44 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 09:44 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1176.eqiad.wmnet with reason: host reimage
* 09:43 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 09:40 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 09:38 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1176.eqiad.wmnet with reason: host reimage
* 09:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2199.codfw.wmnet with reason: Maintenance
* 09:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89668 and previous config saved to /var/cache/conftool/dbconfig/20260303-093542-marostegui.json
* 09:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89667 and previous config saved to /var/cache/conftool/dbconfig/20260303-093224-marostegui.json
* 09:32 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 09:23 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade ()
* 09:23 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db1176.eqiad.wmnet with OS trixie
* 09:21 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 09:20 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 09:20 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade ()
* 09:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P89666 and previous config saved to /var/cache/conftool/dbconfig/20260303-092034-marostegui.json
* 09:19 arnaudb@dns1004: END - running authdns-update
* 09:18 arnaudb@dns1004: START - running authdns-update
* 09:17 moritzm: installing libbpf updates from Bookworm point release
* 09:08 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89665 and previous config saved to /var/cache/conftool/dbconfig/20260303-090818-marostegui.json
* 09:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on 6 hosts with reason: Maintenance
* 09:07 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1221.eqiad.wmnet with reason: Maintenance
* 09:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89664 and previous config saved to /var/cache/conftool/dbconfig/20260303-090731-marostegui.json
* 09:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P89663 and previous config saved to /var/cache/conftool/dbconfig/20260303-090526-marostegui.json
* 08:54 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-serve-codfw: maintenance
* 08:53 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-serve-codfw: maintenance
* 08:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P89662 and previous config saved to /var/cache/conftool/dbconfig/20260303-085224-marostegui.json
* 08:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89661 and previous config saved to /var/cache/conftool/dbconfig/20260303-085019-marostegui.json
* 08:47 moritzm: powercycling lvs1013
* 08:41 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade ()
* 08:41 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade ()
* 08:37 fabfur: start upgrading haproxy to 3.0 on A:cp-ulsfo ([[phab:T417253|T417253]])
* 08:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P89660 and previous config saved to /var/cache/conftool/dbconfig/20260303-083716-marostegui.json
* 08:32 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 08:32 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:31 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 08:30 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 08:28 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-serve-codfw: maintenance
* 08:27 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-serve-codfw: maintenance
* 08:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89659 and previous config saved to /var/cache/conftool/dbconfig/20260303-082424-marostegui.json
* 08:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 08:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89658 and previous config saved to /var/cache/conftool/dbconfig/20260303-082400-marostegui.json
* 08:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89657 and previous config saved to /var/cache/conftool/dbconfig/20260303-082209-marostegui.json
* 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P89656 and previous config saved to /var/cache/conftool/dbconfig/20260303-080853-marostegui.json
* 08:07 moritzm: installing PAM security updates on Bookworm
* 07:55 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1199 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89655 and previous config saved to /var/cache/conftool/dbconfig/20260303-075526-marostegui.json
* 07:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1199.eqiad.wmnet with reason: Maintenance
* 07:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89654 and previous config saved to /var/cache/conftool/dbconfig/20260303-075502-marostegui.json
* 07:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P89653 and previous config saved to /var/cache/conftool/dbconfig/20260303-075345-marostegui.json
* 07:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P89652 and previous config saved to /var/cache/conftool/dbconfig/20260303-073955-marostegui.json
* 07:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89651 and previous config saved to /var/cache/conftool/dbconfig/20260303-073838-marostegui.json
* 07:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P89650 and previous config saved to /var/cache/conftool/dbconfig/20260303-072447-marostegui.json
* 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
* 07:18 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
* 07:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89649 and previous config saved to /var/cache/conftool/dbconfig/20260303-071054-marostegui.json
* 07:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 07:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89648 and previous config saved to /var/cache/conftool/dbconfig/20260303-071029-marostegui.json
* 07:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89647 and previous config saved to /var/cache/conftool/dbconfig/20260303-070940-marostegui.json
* 06:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P89646 and previous config saved to /var/cache/conftool/dbconfig/20260303-065523-marostegui.json
* 06:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1190 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89645 and previous config saved to /var/cache/conftool/dbconfig/20260303-064405-marostegui.json
* 06:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1190.eqiad.wmnet with reason: Maintenance
* 06:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P89644 and previous config saved to /var/cache/conftool/dbconfig/20260303-064015-marostegui.json
* 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2240 gradually with 4 steps - repool after schema change
* 06:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89642 and previous config saved to /var/cache/conftool/dbconfig/20260303-062507-marostegui.json
* 05:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2147 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89639 and previous config saved to /var/cache/conftool/dbconfig/20260303-055834-marostegui.json
* 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2147.codfw.wmnet with reason: Maintenance
* 05:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 05:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2212.codfw.wmnet with reason: Maintenance
* 05:48 marostegui@cumin1003: START - Cookbook sre.mysql.pool db2240 gradually with 4 steps - repool after schema change
* 05:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 05:01 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.15 (duration: 01m 10s)
* 04:43 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.18 refs [[phab:T413809|T413809]] (duration: 39m 43s)
* 04:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.18 refs [[phab:T413809|T413809]]
* 03:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 03:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89637 and previous config saved to /var/cache/conftool/dbconfig/20260303-035746-marostegui.json
* 03:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P89636 and previous config saved to /var/cache/conftool/dbconfig/20260303-034239-marostegui.json
* 03:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P89635 and previous config saved to /var/cache/conftool/dbconfig/20260303-032731-marostegui.json
* 03:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89634 and previous config saved to /var/cache/conftool/dbconfig/20260303-031224-marostegui.json
* 03:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1251 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89633 and previous config saved to /var/cache/conftool/dbconfig/20260303-030217-marostegui.json
* 03:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1251.eqiad.wmnet with reason: Maintenance
* 02:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1240.eqiad.wmnet with reason: Maintenance
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 00s)
* 02:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 02:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89632 and previous config saved to /var/cache/conftool/dbconfig/20260303-020817-marostegui.json
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P89631 and previous config saved to /var/cache/conftool/dbconfig/20260303-015309-marostegui.json
* 01:42 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mwlog2003.codfw.wmnet with OS trixie
* 01:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P89630 and previous config saved to /var/cache/conftool/dbconfig/20260303-013802-marostegui.json
* 01:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89629 and previous config saved to /var/cache/conftool/dbconfig/20260303-013719-marostegui.json
* 01:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89628 and previous config saved to /var/cache/conftool/dbconfig/20260303-012254-marostegui.json
* 01:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P89627 and previous config saved to /var/cache/conftool/dbconfig/20260303-012211-marostegui.json
* 01:19 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mwlog2003.codfw.wmnet with reason: host reimage
* 01:11 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mwlog2003.codfw.wmnet with reason: host reimage
* 01:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1235 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89626 and previous config saved to /var/cache/conftool/dbconfig/20260303-011151-marostegui.json
* 01:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1235.eqiad.wmnet with reason: Maintenance
* 01:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89625 and previous config saved to /var/cache/conftool/dbconfig/20260303-011128-marostegui.json
* 01:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P89624 and previous config saved to /var/cache/conftool/dbconfig/20260303-010703-marostegui.json
* 00:59 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247189{{!}}Revert "ImageListPager: Properly support file schema migration read new"]] (duration: 08m 12s)
* 00:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P89623 and previous config saved to /var/cache/conftool/dbconfig/20260303-005620-marostegui.json
* 00:56 zabe@deploy2002: zabe: Continuing with sync
* 00:54 herron@cumin1003: START - Cookbook sre.hosts.reimage for host mwlog2003.codfw.wmnet with OS trixie
* 00:53 zabe@deploy2002: zabe: Backport for [[gerrit:1247189{{!}}Revert "ImageListPager: Properly support file schema migration read new"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mwlog2003.codfw.wmnet with OS trixie
* 00:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89622 and previous config saved to /var/cache/conftool/dbconfig/20260303-005156-marostegui.json
* 00:51 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1247189{{!}}Revert "ImageListPager: Properly support file schema migration read new"]]
* 00:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P89621 and previous config saved to /var/cache/conftool/dbconfig/20260303-004112-marostegui.json
* 00:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2216 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89620 and previous config saved to /var/cache/conftool/dbconfig/20260303-004056-marostegui.json
* 00:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2216.codfw.wmnet with reason: Maintenance
* 00:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89619 and previous config saved to /var/cache/conftool/dbconfig/20260303-004033-marostegui.json
* 00:31 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mwlog1003.eqiad.wmnet with OS trixie
* 00:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89618 and previous config saved to /var/cache/conftool/dbconfig/20260303-002604-marostegui.json
* 00:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to https://phabricator.wikimedia.org/P89617 and previous config saved to /var/cache/conftool/dbconfig/20260303-002525-marostegui.json
* 00:20 zabe@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 00:18 zabe@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 00:18 zabe@deploy2002: Finished scap sync-world: [[phab:T418327|T418327]] (duration: 05m 01s)
* 00:15 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1234 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89616 and previous config saved to /var/cache/conftool/dbconfig/20260303-001504-marostegui.json
* 00:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1234.eqiad.wmnet with reason: Maintenance
* 00:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89615 and previous config saved to /var/cache/conftool/dbconfig/20260303-001440-marostegui.json
* 00:13 zabe@deploy2002: Started scap sync-world: [[phab:T418327|T418327]]
* 00:11 zabe@deploy2002: zabe: Continuing with sync
* 00:10 zabe@deploy2002: zabe: Backport for [[gerrit:1247068{{!}}ImageListPager: Properly support file schema migration read new (T418327)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to https://phabricator.wikimedia.org/P89614 and previous config saved to /var/cache/conftool/dbconfig/20260303-001018-marostegui.json
* 00:08 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1247068{{!}}ImageListPager: Properly support file schema migration read new (T418327)]]
== 2026-03-02 ==
* 23:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P89613 and previous config saved to /var/cache/conftool/dbconfig/20260302-235933-marostegui.json
* 23:58 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1240320{{!}}Stop writing to il_to on testwiki (T415787)]] (duration: 06m 02s)
* 23:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89612 and previous config saved to /var/cache/conftool/dbconfig/20260302-235511-marostegui.json
* 23:54 zabe@deploy2002: zabe: Continuing with sync
* 23:53 zabe@deploy2002: zabe: Backport for [[gerrit:1240320{{!}}Stop writing to il_to on testwiki (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:52 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1240320{{!}}Stop writing to il_to on testwiki (T415787)]]
* 23:51 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp2058.codfw.wmnet with reason: dcops troubleshooting for [[phab:T418527|T418527]]
* 23:50 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1246880{{!}}multiversion: Stop setting MW_USE_CONFIG_SCHEMA (T304460)]] (duration: 07m 10s)
* 23:47 zabe@deploy2002: zabe: Continuing with sync
* 23:45 zabe@deploy2002: zabe: Backport for [[gerrit:1246880{{!}}multiversion: Stop setting MW_USE_CONFIG_SCHEMA (T304460)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P89611 and previous config saved to /var/cache/conftool/dbconfig/20260302-234425-marostegui.json
* 23:44 herron@cumin1003: START - Cookbook sre.hosts.reimage for host mwlog2003.codfw.wmnet with OS trixie
* 23:43 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89610 and previous config saved to /var/cache/conftool/dbconfig/20260302-234350-marostegui.json
* 23:43 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1246880{{!}}multiversion: Stop setting MW_USE_CONFIG_SCHEMA (T304460)]]
* 23:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2203.codfw.wmnet with reason: Maintenance
* 23:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2202.codfw.wmnet with reason: Maintenance
* 23:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89609 and previous config saved to /var/cache/conftool/dbconfig/20260302-233517-marostegui.json
* 23:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89608 and previous config saved to /var/cache/conftool/dbconfig/20260302-232918-marostegui.json
* 23:25 dwisehaupt@dns1006: END - running authdns-update
* 23:24 dwisehaupt@dns1006: START - running authdns-update
* 23:23 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mwlog1003.eqiad.wmnet with reason: host reimage
* 23:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P89607 and previous config saved to /var/cache/conftool/dbconfig/20260302-232009-marostegui.json
* 23:18 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mwlog1003.eqiad.wmnet with reason: host reimage
* 23:17 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1232 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89606 and previous config saved to /var/cache/conftool/dbconfig/20260302-231723-marostegui.json
* 23:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1232.eqiad.wmnet with reason: Maintenance
* 23:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89605 and previous config saved to /var/cache/conftool/dbconfig/20260302-231658-marostegui.json
* 23:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P89604 and previous config saved to /var/cache/conftool/dbconfig/20260302-230502-marostegui.json
* 23:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P89603 and previous config saved to /var/cache/conftool/dbconfig/20260302-230151-marostegui.json
* 22:57 herron@cumin1003: START - Cookbook sre.hosts.reimage for host mwlog1003.eqiad.wmnet with OS trixie
* 22:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89602 and previous config saved to /var/cache/conftool/dbconfig/20260302-224954-marostegui.json
* 22:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P89601 and previous config saved to /var/cache/conftool/dbconfig/20260302-224643-marostegui.json
* 22:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2188 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89600 and previous config saved to /var/cache/conftool/dbconfig/20260302-223612-marostegui.json
* 22:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2188.codfw.wmnet with reason: Maintenance
* 22:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89599 and previous config saved to /var/cache/conftool/dbconfig/20260302-223548-marostegui.json
* 22:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89598 and previous config saved to /var/cache/conftool/dbconfig/20260302-223135-marostegui.json
* 22:21 maryum: Deployed security fix for [[phab:T418179|T418179]]
* 22:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P89597 and previous config saved to /var/cache/conftool/dbconfig/20260302-222041-marostegui.json
* 22:19 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89596 and previous config saved to /var/cache/conftool/dbconfig/20260302-221938-marostegui.json
* 22:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1219.eqiad.wmnet with reason: Maintenance
* 22:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89595 and previous config saved to /var/cache/conftool/dbconfig/20260302-221925-marostegui.json
* 22:10 aaron@deploy2002: Finished scap sync-world: Backport for [[gerrit:1242613{{!}}Add growthexperiments.v0 to $wgRestSandboxSpecs (T414470)]] (duration: 06m 39s)
* 22:06 aaron@deploy2002: aaron: Continuing with sync
* 22:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P89594 and previous config saved to /var/cache/conftool/dbconfig/20260302-220533-marostegui.json
* 22:05 aaron@deploy2002: aaron: Backport for [[gerrit:1242613{{!}}Add growthexperiments.v0 to $wgRestSandboxSpecs (T414470)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P89593 and previous config saved to /var/cache/conftool/dbconfig/20260302-220418-marostegui.json
* 22:03 aaron@deploy2002: Started scap sync-world: Backport for [[gerrit:1242613{{!}}Add growthexperiments.v0 to $wgRestSandboxSpecs (T414470)]]
* 22:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-backup2003.codfw.wmnet with OS trixie
* 22:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-backup2004.codfw.wmnet with OS trixie
* 22:03 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 22:03 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 22:02 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 22:01 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247149{{!}}ApiCSPReport: Use structured logging for CSP reports]] (duration: 08m 19s)
* 21:57 catrope@deploy2002: catrope: Continuing with sync
* 21:56 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 21:55 catrope@deploy2002: catrope: Backport for [[gerrit:1247149{{!}}ApiCSPReport: Use structured logging for CSP reports]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:53 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1247149{{!}}ApiCSPReport: Use structured logging for CSP reports]]
* 21:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89592 and previous config saved to /var/cache/conftool/dbconfig/20260302-215025-marostegui.json
* 21:50 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp2043.codfw.wmnet with reason: These are test instances, failing should not notif
* 21:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P89591 and previous config saved to /var/cache/conftool/dbconfig/20260302-214910-marostegui.json
* 21:48 inflatador: bking@desktop restarting wdqs codfw to clear ProbeDown alerts
* 21:43 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cp2043.codfw.wmnet
* 21:42 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-backup2004.codfw.wmnet with reason: host reimage
* 21:39 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2176 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89590 and previous config saved to /var/cache/conftool/dbconfig/20260302-213957-marostegui.json
* 21:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 21:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89589 and previous config saved to /var/cache/conftool/dbconfig/20260302-213934-marostegui.json
* 21:38 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-backup2003.codfw.wmnet with reason: host reimage
* 21:36 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:cassandra-dev: Testing removal of OpenJDK 8 support - eevans@cumin1003
* 21:34 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1226024{{!}}Add Comments namespace for shnwikinews (T414403)]] (duration: 07m 07s)
* 21:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89588 and previous config saved to /var/cache/conftool/dbconfig/20260302-213402-marostegui.json
* 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-backup2004.codfw.wmnet with reason: host reimage
* 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-backup2003.codfw.wmnet with reason: host reimage
* 21:30 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp2043.codfw.wmnet
* 21:30 catrope@deploy2002: shivaanshsingh, catrope: Continuing with sync
* 21:29 catrope@deploy2002: shivaanshsingh, catrope: Backport for [[gerrit:1226024{{!}}Add Comments namespace for shnwikinews (T414403)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:27 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1226024{{!}}Add Comments namespace for shnwikinews (T414403)]]
* 21:24 kemayo@deploy2002: Finished scap sync-world: Backport for [[gerrit:1243990{{!}}Suggestion Mode: add values for suggestion feedback properties (T401739)]], [[gerrit:1240721{{!}}Stop PasteCheck A/B test (T417429)]] (duration: 10m 55s)
* 21:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P89587 and previous config saved to /var/cache/conftool/dbconfig/20260302-212426-marostegui.json
* 21:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89586 and previous config saved to /var/cache/conftool/dbconfig/20260302-212345-marostegui.json
* 21:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1218.eqiad.wmnet with reason: Maintenance
* 21:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89585 and previous config saved to /var/cache/conftool/dbconfig/20260302-212321-marostegui.json
* 21:20 kemayo@deploy2002: esanders, kemayo, caro: Continuing with sync
* 21:17 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host ms-backup2004.codfw.wmnet with OS trixie
* 21:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host ms-backup2003.codfw.wmnet with OS trixie
* 21:16 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:cassandra-dev: Testing removal of OpenJDK 8 support - eevans@cumin1003
* 21:16 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ms-backup2003']
* 21:15 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-backup2003']
* 21:15 kemayo@deploy2002: esanders, kemayo, caro: Backport for [[gerrit:1243990{{!}}Suggestion Mode: add values for suggestion feedback properties (T401739)]], [[gerrit:1240721{{!}}Stop PasteCheck A/B test (T417429)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:14 inflatador: bking@apt1002 reprepro --component thirdparty/opensearch3 update trixie-wikimedia [[phab:T418388|T418388]]
* 21:13 kemayo@deploy2002: Started scap sync-world: Backport for [[gerrit:1243990{{!}}Suggestion Mode: add values for suggestion feedback properties (T401739)]], [[gerrit:1240721{{!}}Stop PasteCheck A/B test (T417429)]]
* 21:12 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-backup2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:11 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-backup2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:10 dani@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247107{{!}}Undeploy Comparative Reader Research survey on eswiki (T417834)]], [[gerrit:1247105{{!}}Undeploy Comparative Reader Research survey on enwiki (T417829)]] (duration: 06m 52s)
* 21:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P89584 and previous config saved to /var/cache/conftool/dbconfig/20260302-210919-marostegui.json
* 21:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P89583 and previous config saved to /var/cache/conftool/dbconfig/20260302-210813-marostegui.json
* 21:06 dani@deploy2002: dani: Continuing with sync
* 21:05 dani@deploy2002: dani: Backport for [[gerrit:1247107{{!}}Undeploy Comparative Reader Research survey on eswiki (T417834)]], [[gerrit:1247105{{!}}Undeploy Comparative Reader Research survey on enwiki (T417829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:03 dani@deploy2002: Started scap sync-world: Backport for [[gerrit:1247107{{!}}Undeploy Comparative Reader Research survey on eswiki (T417834)]], [[gerrit:1247105{{!}}Undeploy Comparative Reader Research survey on enwiki (T417829)]]
* 20:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-backup2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-backup2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-backup2004
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-backup2004
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-backup2003
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-backup2003
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-backup2003 to codfw - jhancock@cumin2002"
* 20:54 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-backup2003 to codfw - jhancock@cumin2002"
* 20:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89582 and previous config saved to /var/cache/conftool/dbconfig/20260302-205411-marostegui.json
* 20:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P89581 and previous config saved to /var/cache/conftool/dbconfig/20260302-205307-marostegui.json
* 20:50 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 20:42 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:41 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89580 and previous config saved to /var/cache/conftool/dbconfig/20260302-204136-marostegui.json
* 20:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 20:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89579 and previous config saved to /var/cache/conftool/dbconfig/20260302-204112-marostegui.json
* 20:41 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:39 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89578 and previous config saved to /var/cache/conftool/dbconfig/20260302-203759-marostegui.json
* 20:37 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:31 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89577 and previous config saved to /var/cache/conftool/dbconfig/20260302-202740-marostegui.json
* 20:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1206.eqiad.wmnet with reason: Maintenance
* 20:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89576 and previous config saved to /var/cache/conftool/dbconfig/20260302-202716-marostegui.json
* 20:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P89575 and previous config saved to /var/cache/conftool/dbconfig/20260302-202604-marostegui.json
* 20:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P89574 and previous config saved to /var/cache/conftool/dbconfig/20260302-201209-marostegui.json
* 20:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P89573 and previous config saved to /var/cache/conftool/dbconfig/20260302-201057-marostegui.json
* 20:01 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 20:00 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 19:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P89572 and previous config saved to /var/cache/conftool/dbconfig/20260302-195702-marostegui.json
* 19:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89571 and previous config saved to /var/cache/conftool/dbconfig/20260302-195549-marostegui.json
* 19:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2173 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89570 and previous config saved to /var/cache/conftool/dbconfig/20260302-194435-marostegui.json
* 19:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 19:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89569 and previous config saved to /var/cache/conftool/dbconfig/20260302-194411-marostegui.json
* 19:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89568 and previous config saved to /var/cache/conftool/dbconfig/20260302-194155-marostegui.json
* 19:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1196 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89566 and previous config saved to /var/cache/conftool/dbconfig/20260302-193119-marostegui.json
* 19:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 19:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 19:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89565 and previous config saved to /var/cache/conftool/dbconfig/20260302-193046-marostegui.json
* 19:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P89564 and previous config saved to /var/cache/conftool/dbconfig/20260302-192903-marostegui.json
* 19:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P89563 and previous config saved to /var/cache/conftool/dbconfig/20260302-191539-marostegui.json
* 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P89562 and previous config saved to /var/cache/conftool/dbconfig/20260302-191355-marostegui.json
* 19:12 dzahn@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:12 dzahn@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:04 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2095.codfw.wmnet with OS bullseye
* 19:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P89561 and previous config saved to /var/cache/conftool/dbconfig/20260302-190032-marostegui.json
* 18:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89560 and previous config saved to /var/cache/conftool/dbconfig/20260302-185848-marostegui.json
* 18:54 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 18:53 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 18:48 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89559 and previous config saved to /var/cache/conftool/dbconfig/20260302-184832-marostegui.json
* 18:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 18:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89558 and previous config saved to /var/cache/conftool/dbconfig/20260302-184808-marostegui.json
* 18:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89557 and previous config saved to /var/cache/conftool/dbconfig/20260302-184524-marostegui.json
* 18:34 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89556 and previous config saved to /var/cache/conftool/dbconfig/20260302-183449-marostegui.json
* 18:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1195.eqiad.wmnet with reason: Maintenance
* 18:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89555 and previous config saved to /var/cache/conftool/dbconfig/20260302-183425-marostegui.json
* 18:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P89554 and previous config saved to /var/cache/conftool/dbconfig/20260302-183300-marostegui.json
* 18:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P89553 and previous config saved to /var/cache/conftool/dbconfig/20260302-181918-marostegui.json
* 18:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P89552 and previous config saved to /var/cache/conftool/dbconfig/20260302-181753-marostegui.json
* 18:16 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 18:04 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P89551 and previous config saved to /var/cache/conftool/dbconfig/20260302-180411-marostegui.json
* 18:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89550 and previous config saved to /var/cache/conftool/dbconfig/20260302-180245-marostegui.json
* 18:01 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:54 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:53 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host contint2003
* 17:53 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host contint2003
* 17:52 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:52 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
* 17:52 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
* 17:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89549 and previous config saved to /var/cache/conftool/dbconfig/20260302-174917-marostegui.json
* 17:49 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 17:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 17:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89548 and previous config saved to /var/cache/conftool/dbconfig/20260302-174903-marostegui.json
* 17:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89547 and previous config saved to /var/cache/conftool/dbconfig/20260302-174854-marostegui.json
* 17:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:44 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2095.codfw.wmnet with OS bullseye
* 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host contint2003
* 17:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host contint2003
* 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
* 17:42 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
* 17:39 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 17:38 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1186 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89546 and previous config saved to /var/cache/conftool/dbconfig/20260302-173827-marostegui.json
* 17:38 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 17:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89545 and previous config saved to /var/cache/conftool/dbconfig/20260302-173803-marostegui.json
* 17:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 17:36 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 17:34 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 17:33 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 17:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P89544 and previous config saved to /var/cache/conftool/dbconfig/20260302-173347-marostegui.json
* 17:32 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.update-replication (exit_code=99)
* 17:32 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 17:24 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 17:23 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 17:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P89543 and previous config saved to /var/cache/conftool/dbconfig/20260302-172256-marostegui.json
* 17:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P89542 and previous config saved to /var/cache/conftool/dbconfig/20260302-171839-marostegui.json
* 17:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P89541 and previous config saved to /var/cache/conftool/dbconfig/20260302-170748-marostegui.json
* 17:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89540 and previous config saved to /var/cache/conftool/dbconfig/20260302-170331-marostegui.json
* 16:52 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2230.codfw.wmnet with OS trixie
* 16:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89539 and previous config saved to /var/cache/conftool/dbconfig/20260302-165240-marostegui.json
* 16:51 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2146 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89538 and previous config saved to /var/cache/conftool/dbconfig/20260302-165153-marostegui.json
* 16:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2146.codfw.wmnet with reason: Maintenance
* 16:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89537 and previous config saved to /var/cache/conftool/dbconfig/20260302-165129-marostegui.json
* 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1184 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89536 and previous config saved to /var/cache/conftool/dbconfig/20260302-164141-marostegui.json
* 16:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89535 and previous config saved to /var/cache/conftool/dbconfig/20260302-164118-marostegui.json
* 16:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P89534 and previous config saved to /var/cache/conftool/dbconfig/20260302-163622-marostegui.json
* 16:29 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2230.codfw.wmnet with reason: host reimage
* 16:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P89533 and previous config saved to /var/cache/conftool/dbconfig/20260302-162610-marostegui.json
* 16:21 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2230.codfw.wmnet with reason: host reimage
* 16:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P89532 and previous config saved to /var/cache/conftool/dbconfig/20260302-162115-marostegui.json
* 16:19 moritzm: installing PAM security updates on Bookworm
* 16:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P89531 and previous config saved to /var/cache/conftool/dbconfig/20260302-161102-marostegui.json
* 16:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89530 and previous config saved to /var/cache/conftool/dbconfig/20260302-160607-marostegui.json
* 16:05 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db2230.codfw.wmnet with OS trixie
* 15:56 moritzm: installing glibc bugfix updates from Trixie point release
* 15:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89529 and previous config saved to /var/cache/conftool/dbconfig/20260302-155555-marostegui.json
* 15:55 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2145 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89528 and previous config saved to /var/cache/conftool/dbconfig/20260302-155527-marostegui.json
* 15:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2145.codfw.wmnet with reason: Maintenance
* 15:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 15:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1169.eqiad.wmnet
* 15:45 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89527 and previous config saved to /var/cache/conftool/dbconfig/20260302-154520-marostegui.json
* 15:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 15:38 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main'.
* 15:34 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1169.eqiad.wmnet
* 15:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1167.eqiad.wmnet
* 15:32 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 15:32 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 15:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1209.eqiad.wmnet with reason: Maintenance
* 15:31 marostegui@cumin1003: dbctl commit (dc=all): 'Restore db1226 full weight after schema change', diff saved to https://phabricator.wikimedia.org/P89526 and previous config saved to /var/cache/conftool/dbconfig/20260302-153100-marostegui.json
* 15:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P89525 and previous config saved to /var/cache/conftool/dbconfig/20260302-152334-marostegui.json
* 15:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1167.eqiad.wmnet
* 15:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1166.eqiad.wmnet
* 15:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2198.codfw.wmnet with reason: Maintenance
* 15:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89524 and previous config saved to /var/cache/conftool/dbconfig/20260302-151838-marostegui.json
* 15:10 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1166.eqiad.wmnet
* 15:10 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1165.eqiad.wmnet
* 15:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P89523 and previous config saved to /var/cache/conftool/dbconfig/20260302-150826-marostegui.json
* 15:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P89522 and previous config saved to /var/cache/conftool/dbconfig/20260302-150330-marostegui.json
* 15:00 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1097.eqiad.wmnet with OS bullseye
* 15:00 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:58 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1165.eqiad.wmnet
* 14:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1164.eqiad.wmnet
* 14:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89520 and previous config saved to /var/cache/conftool/dbconfig/20260302-145318-marostegui.json
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1164.eqiad.wmnet
* 14:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1163.eqiad.wmnet
* 14:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P89519 and previous config saved to /var/cache/conftool/dbconfig/20260302-144823-marostegui.json
* 14:41 Lucas_WMDE: UTC afternoon backport+config window done
* 14:40 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247057{{!}}IPInfo: Set log level to "info" (T374718)]] (duration: 08m 01s)
* 14:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1163.eqiad.wmnet
* 14:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1162.eqiad.wmnet
* 14:36 lucaswerkmeister-wmde@deploy2002: kharlan, lucaswerkmeister-wmde: Continuing with sync
* 14:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1226 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89517 and previous config saved to /var/cache/conftool/dbconfig/20260302-143608-marostegui.json
* 14:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1226.eqiad.wmnet with reason: Maintenance
* 14:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89516 and previous config saved to /var/cache/conftool/dbconfig/20260302-143544-marostegui.json
* 14:34 lucaswerkmeister-wmde@deploy2002: kharlan, lucaswerkmeister-wmde: Backport for [[gerrit:1247057{{!}}IPInfo: Set log level to "info" (T374718)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89515 and previous config saved to /var/cache/conftool/dbconfig/20260302-143315-marostegui.json
* 14:32 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for [[gerrit:1247057{{!}}IPInfo: Set log level to "info" (T374718)]]
* 14:31 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
* 14:30 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for [[gerrit:1245364{{!}}Add configurations for graphql usage survey and its pipeline tests (T414476)]] (duration: 09m 44s)
* 14:27 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:26 lucaswerkmeister-wmde@deploy2002: itamar, lucaswerkmeister-wmde: Continuing with sync
* 14:26 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 14:25 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1162.eqiad.wmnet
* 14:25 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1161.eqiad.wmnet
* 14:23 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
* 14:22 lucaswerkmeister-wmde@deploy2002: itamar, lucaswerkmeister-wmde: Backport for [[gerrit:1245364{{!}}Add configurations for graphql usage survey and its pipeline tests (T414476)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:20 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for [[gerrit:1245364{{!}}Add configurations for graphql usage survey and its pipeline tests (T414476)]]
* 14:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P89514 and previous config saved to /var/cache/conftool/dbconfig/20260302-142037-marostegui.json
* 14:19 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:18 lucaswerkmeister-wmde@deploy2002: mwscript-k8s job started: namespaceDupes lawiki --fix # [[phab:T418706|T418706]]
* 14:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89513 and previous config saved to /var/cache/conftool/dbconfig/20260302-141834-marostegui.json
* 14:18 elukey@puppetserver1001: conftool action : set/pooled=no; selector: name=ms-fe1013.eqiad.wmnet
* 14:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2195.codfw.wmnet with reason: Maintenance
* 14:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2004.codfw.wmnet
* 14:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89512 and previous config saved to /var/cache/conftool/dbconfig/20260302-141810-marostegui.json
* 14:17 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247063{{!}}lawiki: add Adumbratio (draft) namespace (T418706)]] (duration: 07m 27s)
* 14:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people2004.codfw.wmnet
* 14:13 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, anzx: Continuing with sync
* 14:13 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 14:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1161.eqiad.wmnet
* 14:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1160.eqiad.wmnet
* 14:13 moritzm: installing libcap2 updates from Trixie point release
* 14:12 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, anzx: Backport for [[gerrit:1247063{{!}}lawiki: add Adumbratio (draft) namespace (T418706)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:11 elukey@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:11 elukey@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 14:10 elukey@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:10 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for [[gerrit:1247063{{!}}lawiki: add Adumbratio (draft) namespace (T418706)]]
* 14:10 elukey@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 14:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1028.eqiad.wmnet
* 14:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 14:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P89511 and previous config saved to /var/cache/conftool/dbconfig/20260302-140529-marostegui.json
* 14:04 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1097.eqiad.wmnet with reason: host reimage
* 14:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P89510 and previous config saved to /var/cache/conftool/dbconfig/20260302-140302-marostegui.json
* 14:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1028.eqiad.wmnet
* 14:01 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1160.eqiad.wmnet
* 14:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 14:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1159.eqiad.wmnet
* 14:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1025.eqiad.wmnet
* 13:57 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1097.eqiad.wmnet with reason: host reimage
* 13:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1025.eqiad.wmnet
* 13:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89509 and previous config saved to /var/cache/conftool/dbconfig/20260302-135021-marostegui.json
* 13:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P89508 and previous config saved to /var/cache/conftool/dbconfig/20260302-134754-marostegui.json
* 13:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1159.eqiad.wmnet
* 13:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1158.eqiad.wmnet
* 13:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1097.eqiad.wmnet with OS bullseye
* 13:38 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1097
* 13:38 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1097
* 13:38 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:38 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt ms-be1097 - jclark@cumin1003"
* 13:37 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt ms-be1097 - jclark@cumin1003"
* 13:35 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1158.eqiad.wmnet
* 13:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1157.eqiad.wmnet
* 13:35 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
* 13:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1214 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89507 and previous config saved to /var/cache/conftool/dbconfig/20260302-133503-marostegui.json
* 13:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1214.eqiad.wmnet with reason: Maintenance
* 13:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89506 and previous config saved to /var/cache/conftool/dbconfig/20260302-133440-marostegui.json
* 13:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89505 and previous config saved to /var/cache/conftool/dbconfig/20260302-133247-marostegui.json
* 13:28 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 13:27 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 13:27 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1097
* 13:26 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1097
* 13:24 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1157.eqiad.wmnet
* 13:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1156.eqiad.wmnet
* 13:22 brouberol: Running `echo 'https://turnilo-next.wikimedia.org' {{!}} mwscript-k8s --attach -- purgeList.php`
* 13:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P89504 and previous config saved to /var/cache/conftool/dbconfig/20260302-131932-marostegui.json
* 13:16 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2181 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89503 and previous config saved to /var/cache/conftool/dbconfig/20260302-131653-marostegui.json
* 13:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2181.codfw.wmnet with reason: Maintenance
* 13:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89502 and previous config saved to /var/cache/conftool/dbconfig/20260302-131630-marostegui.json
* 13:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1024.eqiad.wmnet
* 13:14 moritzm: installing libcap2 updates from Bookworm point release
* 13:12 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1156.eqiad.wmnet
* 13:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1155.eqiad.wmnet
* 13:08 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1024.eqiad.wmnet
* 13:07 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
* 13:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P89500 and previous config saved to /var/cache/conftool/dbconfig/20260302-130424-marostegui.json
* 13:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P89499 and previous config saved to /var/cache/conftool/dbconfig/20260302-130122-marostegui.json
* 13:00 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 12:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2356.codfw.wmnet
* 12:58 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2356.codfw.wmnet
* 12:58 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1155.eqiad.wmnet
* 12:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1154.eqiad.wmnet
* 12:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89498 and previous config saved to /var/cache/conftool/dbconfig/20260302-124917-marostegui.json
* 12:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1154.eqiad.wmnet
* 12:46 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1153.eqiad.wmnet
* 12:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P89497 and previous config saved to /var/cache/conftool/dbconfig/20260302-124615-marostegui.json
* 12:35 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1153.eqiad.wmnet
* 12:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1152.eqiad.wmnet
* 12:32 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89494 and previous config saved to /var/cache/conftool/dbconfig/20260302-123253-marostegui.json
* 12:32 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1203.eqiad.wmnet with reason: Maintenance
* 12:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89493 and previous config saved to /var/cache/conftool/dbconfig/20260302-123229-marostegui.json
* 12:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89492 and previous config saved to /var/cache/conftool/dbconfig/20260302-123108-marostegui.json
* 12:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 12:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 12:24 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1152.eqiad.wmnet
* 12:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1151.eqiad.wmnet
* 12:23 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 12:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 12:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 12:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P89491 and previous config saved to /var/cache/conftool/dbconfig/20260302-121722-marostegui.json
* 12:15 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89490 and previous config saved to /var/cache/conftool/dbconfig/20260302-121525-marostegui.json
* 12:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2167.codfw.wmnet with reason: Maintenance
* 12:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89489 and previous config saved to /var/cache/conftool/dbconfig/20260302-121501-marostegui.json
* 12:12 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1151.eqiad.wmnet
* 12:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1150.eqiad.wmnet
* 12:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P89488 and previous config saved to /var/cache/conftool/dbconfig/20260302-120214-marostegui.json
* 12:00 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1150.eqiad.wmnet
* 11:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P89487 and previous config saved to /var/cache/conftool/dbconfig/20260302-115953-marostegui.json
* 11:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89486 and previous config saved to /var/cache/conftool/dbconfig/20260302-114706-marostegui.json
* 11:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P89485 and previous config saved to /var/cache/conftool/dbconfig/20260302-114446-marostegui.json
* 11:30 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1193 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89484 and previous config saved to /var/cache/conftool/dbconfig/20260302-113034-marostegui.json
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo1001.eqiad.wmnet
* 11:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1193.eqiad.wmnet with reason: Maintenance
* 11:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89483 and previous config saved to /var/cache/conftool/dbconfig/20260302-113010-marostegui.json
* 11:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89482 and previous config saved to /var/cache/conftool/dbconfig/20260302-112937-marostegui.json
* 11:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo1001.eqiad.wmnet
* 11:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P89481 and previous config saved to /var/cache/conftool/dbconfig/20260302-111502-marostegui.json
* 11:13 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2166 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89480 and previous config saved to /var/cache/conftool/dbconfig/20260302-111351-marostegui.json
* 11:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2166.codfw.wmnet with reason: Maintenance
* 11:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89479 and previous config saved to /var/cache/conftool/dbconfig/20260302-111327-marostegui.json
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet
* 10:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P89478 and previous config saved to /var/cache/conftool/dbconfig/20260302-105955-marostegui.json
* 10:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P89477 and previous config saved to /var/cache/conftool/dbconfig/20260302-105818-marostegui.json
* 10:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet
* 10:55 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:54 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:52 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 10:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 10:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 10:49 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 10:46 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru and A:cp - 3.0 upgrade ()
* 10:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89476 and previous config saved to /var/cache/conftool/dbconfig/20260302-104446-marostegui.json
* 10:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P89475 and previous config saved to /var/cache/conftool/dbconfig/20260302-104310-marostegui.json
* 10:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89474 and previous config saved to /var/cache/conftool/dbconfig/20260302-102825-marostegui.json
* 10:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1192.eqiad.wmnet with reason: Maintenance
* 10:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89473 and previous config saved to /var/cache/conftool/dbconfig/20260302-102800-marostegui.json
* 10:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P89472 and previous config saved to /var/cache/conftool/dbconfig/20260302-101252-marostegui.json
* 10:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2164 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89471 and previous config saved to /var/cache/conftool/dbconfig/20260302-101200-marostegui.json
* 10:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2164.codfw.wmnet with reason: Maintenance
* 10:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89470 and previous config saved to /var/cache/conftool/dbconfig/20260302-101135-marostegui.json
* 10:08 moritzm: installing intel-microcode bugfix updates on Bookworm hosts
* 09:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P89469 and previous config saved to /var/cache/conftool/dbconfig/20260302-095744-marostegui.json
* 09:57 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru and A:cp - 3.0 upgrade ()
* 09:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P89468 and previous config saved to /var/cache/conftool/dbconfig/20260302-095627-marostegui.json
* 09:55 fabfur: start upgrading haproxy to 3.0 on A:cp-text_magru ([[phab:T417253|T417253]])
* 09:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89467 and previous config saved to /var/cache/conftool/dbconfig/20260302-094236-marostegui.json
* 09:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P89466 and previous config saved to /var/cache/conftool/dbconfig/20260302-094118-marostegui.json
* 09:35 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:35 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:34 moritzm: installing GnuTLS security updates
* 09:34 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:33 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89465 and previous config saved to /var/cache/conftool/dbconfig/20260302-092610-marostegui.json
* 09:26 mlitn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1245265{{!}}Limit additional whitespace to sticky header version only (T416598)]] (duration: 11m 02s)
* 09:26 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89464 and previous config saved to /var/cache/conftool/dbconfig/20260302-092600-marostegui.json
* 09:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1178.eqiad.wmnet with reason: Maintenance
* 09:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89463 and previous config saved to /var/cache/conftool/dbconfig/20260302-092535-marostegui.json
* 09:21 mlitn@deploy2002: mlitn: Continuing with sync
* 09:16 mlitn@deploy2002: mlitn: Backport for [[gerrit:1245265{{!}}Limit additional whitespace to sticky header version only (T416598)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:15 mlitn@deploy2002: Started scap sync-world: Backport for [[gerrit:1245265{{!}}Limit additional whitespace to sticky header version only (T416598)]]
* 09:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P89462 and previous config saved to /var/cache/conftool/dbconfig/20260302-091027-marostegui.json
* 09:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2163 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89461 and previous config saved to /var/cache/conftool/dbconfig/20260302-091003-marostegui.json
* 09:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2163.codfw.wmnet with reason: Maintenance
* 09:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89460 and previous config saved to /var/cache/conftool/dbconfig/20260302-090938-marostegui.json
* 09:08 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1246904{{!}}HCaptchaEnterpriseHealthChecker: Add configurable retry count and delay (T418477)]] (duration: 16m 09s)
* 09:02 kharlan@deploy2002: kharlan: Continuing with sync
* 08:57 kharlan@deploy2002: kharlan: Backport for [[gerrit:1246904{{!}}HCaptchaEnterpriseHealthChecker: Add configurable retry count and delay (T418477)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P89459 and previous config saved to /var/cache/conftool/dbconfig/20260302-085519-marostegui.json
* 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P89458 and previous config saved to /var/cache/conftool/dbconfig/20260302-085430-marostegui.json
* 08:51 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1246904{{!}}HCaptchaEnterpriseHealthChecker: Add configurable retry count and delay (T418477)]]
* 08:48 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru and A:cp - 3.0 upgrade ()
* 08:47 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbproxy1029.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 08:45 moritzm: installing libxml2 security updates
* 08:44 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1240672{{!}}Enable revert risk filters for first batch of wikis: < 1000 monthly edits (T411485)]] (duration: 37m 12s)
* 08:42 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dbproxy1029.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 08:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbproxy1028.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 08:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89457 and previous config saved to /var/cache/conftool/dbconfig/20260302-084010-marostegui.json
* 08:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P89456 and previous config saved to /var/cache/conftool/dbconfig/20260302-083922-marostegui.json
* 08:31 kgraessle@deploy2002: kgraessle: Continuing with sync
* 08:30 kgraessle@deploy2002: kgraessle: Backport for [[gerrit:1240672{{!}}Enable revert risk filters for first batch of wikis: < 1000 monthly edits (T411485)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dbproxy1028.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 08:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89455 and previous config saved to /var/cache/conftool/dbconfig/20260302-082414-marostegui.json
* 08:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1177 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89454 and previous config saved to /var/cache/conftool/dbconfig/20260302-082333-marostegui.json
* 08:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1177.eqiad.wmnet with reason: Maintenance
* 08:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89453 and previous config saved to /var/cache/conftool/dbconfig/20260302-082309-marostegui.json
* 08:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbproxy1028.eqiad.wmnet with reason: Maintenance
* 08:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbproxy1029.eqiad.wmnet with reason: Maintenance
* 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89452 and previous config saved to /var/cache/conftool/dbconfig/20260302-080813-marostegui.json
* 08:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2161.codfw.wmnet with reason: Maintenance
* 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P89451 and previous config saved to /var/cache/conftool/dbconfig/20260302-080800-marostegui.json
* 08:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89450 and previous config saved to /var/cache/conftool/dbconfig/20260302-080748-marostegui.json
* 08:07 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1240672{{!}}Enable revert risk filters for first batch of wikis: < 1000 monthly edits (T411485)]]
* 08:05 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru and A:cp - 3.0 upgrade ()
* 08:05 fabfur: start upgrading haproxy to 3.0 on A:cp-upload_magru ([[phab:T417253|T417253]])
* 07:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P89449 and previous config saved to /var/cache/conftool/dbconfig/20260302-075252-marostegui.json
* 07:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P89448 and previous config saved to /var/cache/conftool/dbconfig/20260302-075241-marostegui.json
* 07:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89447 and previous config saved to /var/cache/conftool/dbconfig/20260302-073745-marostegui.json
* 07:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P89446 and previous config saved to /var/cache/conftool/dbconfig/20260302-073732-marostegui.json
* 07:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89445 and previous config saved to /var/cache/conftool/dbconfig/20260302-072224-marostegui.json
* 07:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89444 and previous config saved to /var/cache/conftool/dbconfig/20260302-072058-marostegui.json
* 07:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1172.eqiad.wmnet with reason: Maintenance
* 07:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 07:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89443 and previous config saved to /var/cache/conftool/dbconfig/20260302-070523-marostegui.json
* 07:05 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2154 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89442 and previous config saved to /var/cache/conftool/dbconfig/20260302-070512-marostegui.json
* 07:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2154.codfw.wmnet with reason: Maintenance
* 07:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89441 and previous config saved to /var/cache/conftool/dbconfig/20260302-070447-marostegui.json
* 07:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1244: After schema change
* 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P89439 and previous config saved to /var/cache/conftool/dbconfig/20260302-065014-marostegui.json
* 06:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P89438 and previous config saved to /var/cache/conftool/dbconfig/20260302-064938-marostegui.json
* 06:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P89436 and previous config saved to /var/cache/conftool/dbconfig/20260302-063506-marostegui.json
* 06:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P89435 and previous config saved to /var/cache/conftool/dbconfig/20260302-063430-marostegui.json
* 06:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89433 and previous config saved to /var/cache/conftool/dbconfig/20260302-061957-marostegui.json
* 06:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89432 and previous config saved to /var/cache/conftool/dbconfig/20260302-061922-marostegui.json
* 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2240.codfw.wmnet with reason: Maintenance
* 06:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1244: After schema change
* 06:15 marostegui@dns1004: END - running authdns-update
* 06:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2240 [[phab:T418080|T418080]]', diff saved to https://phabricator.wikimedia.org/P89430 and previous config saved to /var/cache/conftool/dbconfig/20260302-061428-marostegui.json
* 06:13 marostegui@dns1004: START - running authdns-update
* 06:13 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2179 to s4 primary and set section read-write [[phab:T418080|T418080]]', diff saved to https://phabricator.wikimedia.org/P89429 and previous config saved to /var/cache/conftool/dbconfig/20260302-061316-marostegui.json
* 06:12 marostegui@cumin1003: dbctl commit (dc=all): 'Set s4 codfw as read-only for maintenance - [[phab:T418080|T418080]]', diff saved to https://phabricator.wikimedia.org/P89428 and previous config saved to /var/cache/conftool/dbconfig/20260302-061252-marostegui.json
* 06:06 marostegui: Starting s4 codfw failover from db2240 to db2179 - [[phab:T418080|T418080]]
* 06:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 42 hosts with reason: Primary switchover s4 [[phab:T418080|T418080]]
* 06:03 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2179 with weight 0 [[phab:T418080|T418080]]', diff saved to https://phabricator.wikimedia.org/P89427 and previous config saved to /var/cache/conftool/dbconfig/20260302-060317-marostegui.json
* 06:03 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89426 and previous config saved to /var/cache/conftool/dbconfig/20260302-060317-marostegui.json
* 06:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 06:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 06:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2152 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89425 and previous config saved to /var/cache/conftool/dbconfig/20260302-060245-marostegui.json
* 06:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2152.codfw.wmnet with reason: Maintenance
* 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2220.codfw.wmnet with reason: Maintenance
* 02:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 13s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 00:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 00:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89424 and previous config saved to /var/cache/conftool/dbconfig/20260302-004950-marostegui.json
* 00:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P89423 and previous config saved to /var/cache/conftool/dbconfig/20260302-003441-marostegui.json
* 00:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P89422 and previous config saved to /var/cache/conftool/dbconfig/20260302-001933-marostegui.json
* 00:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89421 and previous config saved to /var/cache/conftool/dbconfig/20260302-000425-marostegui.json
* 00:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1253 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89420 and previous config saved to /var/cache/conftool/dbconfig/20260302-000208-marostegui.json
* 00:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1253.eqiad.wmnet with reason: Maintenance
* 00:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89419 and previous config saved to /var/cache/conftool/dbconfig/20260302-000143-marostegui.json
== 2026-03-01 ==
* 23:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P89418 and previous config saved to /var/cache/conftool/dbconfig/20260301-234635-marostegui.json
* 23:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89417 and previous config saved to /var/cache/conftool/dbconfig/20260301-233524-marostegui.json
* 23:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P89416 and previous config saved to /var/cache/conftool/dbconfig/20260301-233127-marostegui.json
* 23:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P89415 and previous config saved to /var/cache/conftool/dbconfig/20260301-232016-marostegui.json
* 23:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89414 and previous config saved to /var/cache/conftool/dbconfig/20260301-231619-marostegui.json
* 23:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89413 and previous config saved to /var/cache/conftool/dbconfig/20260301-231404-marostegui.json
* 23:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1236.eqiad.wmnet with reason: Maintenance
* 23:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89412 and previous config saved to /var/cache/conftool/dbconfig/20260301-231339-marostegui.json
* 23:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P89411 and previous config saved to /var/cache/conftool/dbconfig/20260301-230508-marostegui.json
* 22:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P89410 and previous config saved to /var/cache/conftool/dbconfig/20260301-225832-marostegui.json
* 22:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89409 and previous config saved to /var/cache/conftool/dbconfig/20260301-224959-marostegui.json
* 22:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2222 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89408 and previous config saved to /var/cache/conftool/dbconfig/20260301-224451-marostegui.json
* 22:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2222.codfw.wmnet with reason: Maintenance
* 22:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89407 and previous config saved to /var/cache/conftool/dbconfig/20260301-224426-marostegui.json
* 22:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P89406 and previous config saved to /var/cache/conftool/dbconfig/20260301-224324-marostegui.json
* 22:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P89405 and previous config saved to /var/cache/conftool/dbconfig/20260301-222919-marostegui.json
* 22:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89404 and previous config saved to /var/cache/conftool/dbconfig/20260301-222815-marostegui.json
* 22:26 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1231 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89403 and previous config saved to /var/cache/conftool/dbconfig/20260301-222600-marostegui.json
* 22:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1231.eqiad.wmnet with reason: Maintenance
* 22:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89402 and previous config saved to /var/cache/conftool/dbconfig/20260301-222536-marostegui.json
* 22:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P89401 and previous config saved to /var/cache/conftool/dbconfig/20260301-221410-marostegui.json
* 22:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P89400 and previous config saved to /var/cache/conftool/dbconfig/20260301-221027-marostegui.json
* 21:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89399 and previous config saved to /var/cache/conftool/dbconfig/20260301-215902-marostegui.json
* 21:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P89398 and previous config saved to /var/cache/conftool/dbconfig/20260301-215519-marostegui.json
* 21:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89397 and previous config saved to /var/cache/conftool/dbconfig/20260301-215404-marostegui.json
* 21:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2221.codfw.wmnet with reason: Maintenance
* 21:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89396 and previous config saved to /var/cache/conftool/dbconfig/20260301-215339-marostegui.json
* 21:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89395 and previous config saved to /var/cache/conftool/dbconfig/20260301-214011-marostegui.json
* 21:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P89394 and previous config saved to /var/cache/conftool/dbconfig/20260301-213831-marostegui.json
* 21:34 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89393 and previous config saved to /var/cache/conftool/dbconfig/20260301-213410-marostegui.json
* 21:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 21:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89392 and previous config saved to /var/cache/conftool/dbconfig/20260301-213346-marostegui.json
* 21:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P89391 and previous config saved to /var/cache/conftool/dbconfig/20260301-212323-marostegui.json
* 21:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P89390 and previous config saved to /var/cache/conftool/dbconfig/20260301-211837-marostegui.json
* 21:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89389 and previous config saved to /var/cache/conftool/dbconfig/20260301-210815-marostegui.json
* 21:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P89388 and previous config saved to /var/cache/conftool/dbconfig/20260301-210329-marostegui.json
* 21:03 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89387 and previous config saved to /var/cache/conftool/dbconfig/20260301-210309-marostegui.json
* 21:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2218.codfw.wmnet with reason: Maintenance
* 21:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89386 and previous config saved to /var/cache/conftool/dbconfig/20260301-210244-marostegui.json
* 20:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89385 and previous config saved to /var/cache/conftool/dbconfig/20260301-204820-marostegui.json
* 20:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P89384 and previous config saved to /var/cache/conftool/dbconfig/20260301-204736-marostegui.json
* 20:46 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1202 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89383 and previous config saved to /var/cache/conftool/dbconfig/20260301-204606-marostegui.json
* 20:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1202.eqiad.wmnet with reason: Maintenance
* 20:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89382 and previous config saved to /var/cache/conftool/dbconfig/20260301-204541-marostegui.json
* 20:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P89381 and previous config saved to /var/cache/conftool/dbconfig/20260301-203227-marostegui.json
* 20:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P89380 and previous config saved to /var/cache/conftool/dbconfig/20260301-203033-marostegui.json
* 20:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89379 and previous config saved to /var/cache/conftool/dbconfig/20260301-201720-marostegui.json
* 20:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P89378 and previous config saved to /var/cache/conftool/dbconfig/20260301-201525-marostegui.json
* 20:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2208 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89377 and previous config saved to /var/cache/conftool/dbconfig/20260301-201212-marostegui.json
* 20:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2208.codfw.wmnet with reason: Maintenance
* 20:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2200.codfw.wmnet with reason: Maintenance
* 20:04 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2198.codfw.wmnet with reason: Maintenance
* 20:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89376 and previous config saved to /var/cache/conftool/dbconfig/20260301-200422-marostegui.json
* 20:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89375 and previous config saved to /var/cache/conftool/dbconfig/20260301-200016-marostegui.json
* 19:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1194 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89374 and previous config saved to /var/cache/conftool/dbconfig/20260301-195803-marostegui.json
* 19:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1194.eqiad.wmnet with reason: Maintenance
* 19:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89373 and previous config saved to /var/cache/conftool/dbconfig/20260301-195738-marostegui.json
* 19:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P89372 and previous config saved to /var/cache/conftool/dbconfig/20260301-194914-marostegui.json
* 19:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P89371 and previous config saved to /var/cache/conftool/dbconfig/20260301-194230-marostegui.json
* 19:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P89370 and previous config saved to /var/cache/conftool/dbconfig/20260301-193406-marostegui.json
* 19:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P89369 and previous config saved to /var/cache/conftool/dbconfig/20260301-192721-marostegui.json
* 19:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89368 and previous config saved to /var/cache/conftool/dbconfig/20260301-191858-marostegui.json
* 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2182 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89367 and previous config saved to /var/cache/conftool/dbconfig/20260301-191340-marostegui.json
* 19:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89366 and previous config saved to /var/cache/conftool/dbconfig/20260301-191315-marostegui.json
* 19:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89365 and previous config saved to /var/cache/conftool/dbconfig/20260301-191213-marostegui.json
* 19:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1191 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89364 and previous config saved to /var/cache/conftool/dbconfig/20260301-190958-marostegui.json
* 19:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1191.eqiad.wmnet with reason: Maintenance
* 19:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89363 and previous config saved to /var/cache/conftool/dbconfig/20260301-190934-marostegui.json
* 18:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P89362 and previous config saved to /var/cache/conftool/dbconfig/20260301-185807-marostegui.json
* 18:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P89361 and previous config saved to /var/cache/conftool/dbconfig/20260301-185425-marostegui.json
* 18:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P89360 and previous config saved to /var/cache/conftool/dbconfig/20260301-184259-marostegui.json
* 18:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P89359 and previous config saved to /var/cache/conftool/dbconfig/20260301-183917-marostegui.json
* 18:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89358 and previous config saved to /var/cache/conftool/dbconfig/20260301-182750-marostegui.json
* 18:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89357 and previous config saved to /var/cache/conftool/dbconfig/20260301-182409-marostegui.json
* 18:22 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2168 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89356 and previous config saved to /var/cache/conftool/dbconfig/20260301-182238-marostegui.json
* 18:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 18:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89355 and previous config saved to /var/cache/conftool/dbconfig/20260301-182213-marostegui.json
* 18:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89354 and previous config saved to /var/cache/conftool/dbconfig/20260301-182153-marostegui.json
* 18:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 18:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 18:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89353 and previous config saved to /var/cache/conftool/dbconfig/20260301-181818-marostegui.json
* 18:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P89352 and previous config saved to /var/cache/conftool/dbconfig/20260301-180705-marostegui.json
* 18:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P89351 and previous config saved to /var/cache/conftool/dbconfig/20260301-180310-marostegui.json
* 17:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P89350 and previous config saved to /var/cache/conftool/dbconfig/20260301-175157-marostegui.json
* 17:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P89349 and previous config saved to /var/cache/conftool/dbconfig/20260301-174802-marostegui.json
* 17:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89348 and previous config saved to /var/cache/conftool/dbconfig/20260301-173649-marostegui.json
* 17:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89347 and previous config saved to /var/cache/conftool/dbconfig/20260301-173253-marostegui.json
* 17:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89346 and previous config saved to /var/cache/conftool/dbconfig/20260301-173134-marostegui.json
* 17:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 17:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89345 and previous config saved to /var/cache/conftool/dbconfig/20260301-173110-marostegui.json
* 17:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89344 and previous config saved to /var/cache/conftool/dbconfig/20260301-172742-marostegui.json
* 17:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 17:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89343 and previous config saved to /var/cache/conftool/dbconfig/20260301-172717-marostegui.json
* 17:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P89342 and previous config saved to /var/cache/conftool/dbconfig/20260301-171602-marostegui.json
* 17:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P89341 and previous config saved to /var/cache/conftool/dbconfig/20260301-171210-marostegui.json
* 17:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P89340 and previous config saved to /var/cache/conftool/dbconfig/20260301-170053-marostegui.json
* 16:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P89339 and previous config saved to /var/cache/conftool/dbconfig/20260301-165701-marostegui.json
* 16:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89338 and previous config saved to /var/cache/conftool/dbconfig/20260301-164545-marostegui.json
* 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89337 and previous config saved to /var/cache/conftool/dbconfig/20260301-164153-marostegui.json
* 16:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2150 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89336 and previous config saved to /var/cache/conftool/dbconfig/20260301-164022-marostegui.json
* 16:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2150.codfw.wmnet with reason: Maintenance
* 16:39 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89335 and previous config saved to /var/cache/conftool/dbconfig/20260301-163938-marostegui.json
* 16:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 16:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 16:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1230.eqiad.wmnet with reason: Maintenance
* 16:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2213.codfw.wmnet with reason: Maintenance
* 12:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89334 and previous config saved to /var/cache/conftool/dbconfig/20260301-122201-marostegui.json
* 12:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P89333 and previous config saved to /var/cache/conftool/dbconfig/20260301-120652-marostegui.json
* 11:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P89332 and previous config saved to /var/cache/conftool/dbconfig/20260301-115144-marostegui.json
* 11:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89331 and previous config saved to /var/cache/conftool/dbconfig/20260301-113636-marostegui.json
* 11:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2228 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89330 and previous config saved to /var/cache/conftool/dbconfig/20260301-113156-marostegui.json
* 11:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2228.codfw.wmnet with reason: Maintenance
* 11:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89329 and previous config saved to /var/cache/conftool/dbconfig/20260301-113131-marostegui.json
* 11:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 11:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 11:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1216.eqiad.wmnet with reason: Maintenance
* 11:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89328 and previous config saved to /var/cache/conftool/dbconfig/20260301-111658-marostegui.json
* 11:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P89327 and previous config saved to /var/cache/conftool/dbconfig/20260301-111622-marostegui.json
* 11:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P89326 and previous config saved to /var/cache/conftool/dbconfig/20260301-110151-marostegui.json
* 11:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P89325 and previous config saved to /var/cache/conftool/dbconfig/20260301-110114-marostegui.json
* 10:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P89324 and previous config saved to /var/cache/conftool/dbconfig/20260301-104642-marostegui.json
* 10:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89323 and previous config saved to /var/cache/conftool/dbconfig/20260301-104606-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2223 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89322 and previous config saved to /var/cache/conftool/dbconfig/20260301-104024-marostegui.json
* 10:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2223.codfw.wmnet with reason: Maintenance
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89321 and previous config saved to /var/cache/conftool/dbconfig/20260301-103958-marostegui.json
* 10:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89320 and previous config saved to /var/cache/conftool/dbconfig/20260301-103134-marostegui.json
* 10:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89319 and previous config saved to /var/cache/conftool/dbconfig/20260301-102727-marostegui.json
* 10:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1210.eqiad.wmnet with reason: Maintenance
* 10:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89318 and previous config saved to /var/cache/conftool/dbconfig/20260301-102702-marostegui.json
* 10:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P89317 and previous config saved to /var/cache/conftool/dbconfig/20260301-102450-marostegui.json
* 10:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P89316 and previous config saved to /var/cache/conftool/dbconfig/20260301-101154-marostegui.json
* 10:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P89315 and previous config saved to /var/cache/conftool/dbconfig/20260301-100942-marostegui.json
* 09:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P89314 and previous config saved to /var/cache/conftool/dbconfig/20260301-095645-marostegui.json
* 09:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89313 and previous config saved to /var/cache/conftool/dbconfig/20260301-095434-marostegui.json
* 09:48 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2211 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89312 and previous config saved to /var/cache/conftool/dbconfig/20260301-094847-marostegui.json
* 09:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2211.codfw.wmnet with reason: Maintenance
* 09:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2201.codfw.wmnet with reason: Maintenance
* 09:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89311 and previous config saved to /var/cache/conftool/dbconfig/20260301-094432-marostegui.json
* 09:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89310 and previous config saved to /var/cache/conftool/dbconfig/20260301-094137-marostegui.json
* 09:38 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1207 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89309 and previous config saved to /var/cache/conftool/dbconfig/20260301-093835-marostegui.json
* 09:38 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1207.eqiad.wmnet with reason: Maintenance
* 09:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89308 and previous config saved to /var/cache/conftool/dbconfig/20260301-093810-marostegui.json
* 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P89307 and previous config saved to /var/cache/conftool/dbconfig/20260301-092923-marostegui.json
* 09:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P89306 and previous config saved to /var/cache/conftool/dbconfig/20260301-092302-marostegui.json
* 09:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P89305 and previous config saved to /var/cache/conftool/dbconfig/20260301-091415-marostegui.json
* 09:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P89304 and previous config saved to /var/cache/conftool/dbconfig/20260301-090754-marostegui.json
* 08:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89303 and previous config saved to /var/cache/conftool/dbconfig/20260301-085907-marostegui.json
* 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89302 and previous config saved to /var/cache/conftool/dbconfig/20260301-085427-marostegui.json
* 08:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2192.codfw.wmnet with reason: Maintenance
* 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89301 and previous config saved to /var/cache/conftool/dbconfig/20260301-085403-marostegui.json
* 08:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89300 and previous config saved to /var/cache/conftool/dbconfig/20260301-085246-marostegui.json
* 08:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1200 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89299 and previous config saved to /var/cache/conftool/dbconfig/20260301-084952-marostegui.json
* 08:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1200.eqiad.wmnet with reason: Maintenance
* 08:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89298 and previous config saved to /var/cache/conftool/dbconfig/20260301-084928-marostegui.json
* 08:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P89297 and previous config saved to /var/cache/conftool/dbconfig/20260301-083855-marostegui.json
* 08:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P89296 and previous config saved to /var/cache/conftool/dbconfig/20260301-083420-marostegui.json
* 08:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P89295 and previous config saved to /var/cache/conftool/dbconfig/20260301-082346-marostegui.json
* 08:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P89294 and previous config saved to /var/cache/conftool/dbconfig/20260301-081912-marostegui.json
* 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89293 and previous config saved to /var/cache/conftool/dbconfig/20260301-080838-marostegui.json
* 08:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89292 and previous config saved to /var/cache/conftool/dbconfig/20260301-080404-marostegui.json
* 08:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2178.codfw.wmnet with reason: Maintenance
* 08:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89291 and previous config saved to /var/cache/conftool/dbconfig/20260301-080341-marostegui.json
* 08:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1185 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89290 and previous config saved to /var/cache/conftool/dbconfig/20260301-080110-marostegui.json
* 08:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1185.eqiad.wmnet with reason: Maintenance
* 08:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89289 and previous config saved to /var/cache/conftool/dbconfig/20260301-080044-marostegui.json
* 07:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P89288 and previous config saved to /var/cache/conftool/dbconfig/20260301-074833-marostegui.json
* 07:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P89287 and previous config saved to /var/cache/conftool/dbconfig/20260301-074536-marostegui.json
* 07:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P89286 and previous config saved to /var/cache/conftool/dbconfig/20260301-073324-marostegui.json
* 07:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P89285 and previous config saved to /var/cache/conftool/dbconfig/20260301-073028-marostegui.json
* 07:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89284 and previous config saved to /var/cache/conftool/dbconfig/20260301-071816-marostegui.json
* 07:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89283 and previous config saved to /var/cache/conftool/dbconfig/20260301-071521-marostegui.json
* 07:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2171 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89282 and previous config saved to /var/cache/conftool/dbconfig/20260301-071226-marostegui.json
* 07:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 07:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89281 and previous config saved to /var/cache/conftool/dbconfig/20260301-071201-marostegui.json
* 07:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89280 and previous config saved to /var/cache/conftool/dbconfig/20260301-071113-marostegui.json
* 07:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 07:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 07:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89279 and previous config saved to /var/cache/conftool/dbconfig/20260301-071040-marostegui.json
* 06:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P89278 and previous config saved to /var/cache/conftool/dbconfig/20260301-065653-marostegui.json
* 06:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P89277 and previous config saved to /var/cache/conftool/dbconfig/20260301-065531-marostegui.json
* 06:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P89276 and previous config saved to /var/cache/conftool/dbconfig/20260301-064145-marostegui.json
* 06:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P89275 and previous config saved to /var/cache/conftool/dbconfig/20260301-064023-marostegui.json
* 06:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89274 and previous config saved to /var/cache/conftool/dbconfig/20260301-062636-marostegui.json
* 06:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89273 and previous config saved to /var/cache/conftool/dbconfig/20260301-062515-marostegui.json
* 06:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89272 and previous config saved to /var/cache/conftool/dbconfig/20260301-062108-marostegui.json
* 06:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1159.eqiad.wmnet with reason: Maintenance
* 06:20 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2157 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89271 and previous config saved to /var/cache/conftool/dbconfig/20260301-062047-marostegui.json
* 06:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2157.codfw.wmnet with reason: Maintenance
* 06:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2207.codfw.wmnet with reason: Maintenance
* 06:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1222.eqiad.wmnet with reason: Maintenance
* 02:13 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 00s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2396605
2396604
2026-03-28T14:48:43Z
Stashbot
7414
wikitext
text/x-wiki
== 2026-03-28 ==
* 14:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 14:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 14:16 mutante: releases1003 - re-enabled puppet, which had been disabled due to [[phab:T418109|T418109]] but should not have been disabled during the switch of the deployment server, leading to [[phab:T421532|T421532]]
== 2026-03-27 ==
* 18:11 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 18:00 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:50 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:40 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:39 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 17:39 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:39 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 17:38 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 17:37 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 17:37 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 17:35 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 17:34 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 17:34 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:30 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:30 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:24 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:19 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:15 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:04 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:55 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:50 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:47 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:42 dancy@deploy1003: Finished deploy [releng/jenkins-deploy@31ace7e] (releasing): (no justification provided) (duration: 01m 18s)
* 16:41 dancy@deploy1003: Started deploy [releng/jenkins-deploy@31ace7e] (releasing): (no justification provided)
* 16:37 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:36 jhathaway@cumin2002: END (ERROR) - Cookbook sre.hosts.provision (exit_code=97) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:27 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:22 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:13 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 16:12 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 16:12 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:11 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:10 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 15:07 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:00 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:09 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:09 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change ips for frack servers - cmooney@cumin1003"
* 14:08 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change ips for frack servers - cmooney@cumin1003"
* 14:02 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 13:52 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:51 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:49 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:49 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:48 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:47 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:47 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:45 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:41 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:40 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:14 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:11 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:10 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:08 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:06 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 11:53 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 11:51 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 11:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 11:48 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 11:30 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 11:27 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 11:15 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-test1006.eqiad.wmnet with OS trixie
* 11:15 taavi@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database abstractwiki ([[phab:T420637|T420637]])
* 11:02 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2002.codfw.wmnet
* 10:54 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2006.codfw.wmnet
* 10:51 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2002.codfw.wmnet
* 10:50 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2006.codfw.wmnet
* 10:46 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2005.codfw.wmnet
* 10:43 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2005.codfw.wmnet
* 10:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1006.eqiad.wmnet with reason: host reimage
* 10:27 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1006.eqiad.wmnet with reason: host reimage
* 10:18 taavi@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database abstractwiki ([[phab:T420637|T420637]])
* 10:12 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1006.eqiad.wmnet with OS trixie
* 10:04 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 10:03 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:01 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:00 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:58 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:57 elukey@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
* 09:37 elukey@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
* 09:06 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 09:05 elukey@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 09:04 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 09:03 elukey@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 08:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 08:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:05 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 08:04 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 08:02 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 07:46 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 03:06 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 02:32 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 02:12 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 07s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:30 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421330|T421330]]
* 01:30 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on zuul1001.eqiad.wmnet with reason: [[phab:T421330|T421330]]
* 01:30 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on zuul2001.codfw.wmnet with reason: [[phab:T421330|T421330]]
* 01:29 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421330|T421330]]
* 01:12 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
== 2026-03-26 ==
* 21:35 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1260834{{!}}Add Logstash logging for successful passwordless logins]], [[gerrit:1261511{{!}}InitialiseSettings: Remove apiportalwiki from $wmgCentralAuthAutoLoginWikis (T421413)]] (duration: 06m 58s)
* 21:31 reedy@deploy1003: catrope, reedy: Continuing with sync
* 21:30 reedy@deploy1003: catrope, reedy: Backport for [[gerrit:1260834{{!}}Add Logstash logging for successful passwordless logins]], [[gerrit:1261511{{!}}InitialiseSettings: Remove apiportalwiki from $wmgCentralAuthAutoLoginWikis (T421413)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:28 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1260834{{!}}Add Logstash logging for successful passwordless logins]], [[gerrit:1261511{{!}}InitialiseSettings: Remove apiportalwiki from $wmgCentralAuthAutoLoginWikis (T421413)]]
* 21:00 suecarmol@deploy1003: Finished scap sync-world: Backport for [[gerrit:1256498{{!}}PersonalDashboard: Add config for Active Discussions (T420785)]] (duration: 13m 53s)
* 20:54 suecarmol@deploy1003: suecarmol: Continuing with sync
* 20:51 suecarmol@deploy1003: suecarmol: Backport for [[gerrit:1256498{{!}}PersonalDashboard: Add config for Active Discussions (T420785)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:46 suecarmol@deploy1003: Started scap sync-world: Backport for [[gerrit:1256498{{!}}PersonalDashboard: Add config for Active Discussions (T420785)]]
* 20:44 kamila@deploy1003: Finished scap sync-world: Backport for [[gerrit:1261545{{!}}Wrap 'centralauthtoken' in a JWT (T420280)]], [[gerrit:1261470{{!}}Enable $wgTempCategoryCollations for testwiki. (T419274 T419049)]] (duration: 37m 32s)
* 20:30 kamila@deploy1003: matmarex, kamila: Continuing with sync
* 20:25 kamila@deploy1003: matmarex, kamila: Backport for [[gerrit:1261545{{!}}Wrap 'centralauthtoken' in a JWT (T420280)]], [[gerrit:1261470{{!}}Enable $wgTempCategoryCollations for testwiki. (T419274 T419049)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host restbase2039.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:09 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host restbase2039.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:08 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>aqs1015.eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 20:08 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host restbase2039
* 20:07 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host restbase2039
* 20:06 kamila@deploy1003: Started scap sync-world: Backport for [[gerrit:1261545{{!}}Wrap 'centralauthtoken' in a JWT (T420280)]], [[gerrit:1261470{{!}}Enable $wgTempCategoryCollations for testwiki. (T419274 T419049)]]
* 20:05 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:05 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding restbase2039 to codfw - jhancock@cumin2002"
* 20:05 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding restbase2039 to codfw - jhancock@cumin2002"
* 20:02 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 20:00 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>aqs1015.eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 19:47 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 19:44 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 18:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 18:48 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 18:42 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/zotero: apply
* 18:42 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/zotero: apply
* 18:42 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
* 18:41 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
* 18:41 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
* 18:40 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/toolhub: apply
* 18:40 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/termbox: apply
* 18:40 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 18:39 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/termbox: apply
* 18:39 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 18:37 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 18:37 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 18:37 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 18:36 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 18:36 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 18:36 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 18:36 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
* 18:35 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
* 18:35 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/recommendation-api: apply
* 18:35 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/recommendation-api: apply
* 18:34 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rdf-streaming-updater: apply
* 18:34 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply
* 18:33 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/push-notifications: apply
* 18:33 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/push-notifications: apply
* 18:32 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 18:31 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 18:31 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
* 18:30 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
* 18:30 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 18:30 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
* 18:30 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 18:28 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 18:28 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:28 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:27 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
* 18:25 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>sessionstore1006.eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 18:21 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
* 18:21 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 18:20 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 18:20 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
* 18:19 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/ipoid: apply
* 18:19 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/image-suggestion: apply
* 18:19 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/image-suggestion: apply
* 18:18 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>sessionstore1006.eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 18:18 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
* 18:17 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
* 18:17 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
* 18:16 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
* 18:15 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 18:14 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 18:13 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
* 18:13 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
* 18:13 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
* 18:12 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
* 18:12 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:12 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 18:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 18:11 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 18:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
* 18:11 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
* 18:10 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
* 18:09 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
* 18:09 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/echostore: apply
* 18:08 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/echostore: apply
* 18:08 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
* 18:08 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
* 18:08 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 18:07 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 18:07 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
* 18:07 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
* 18:07 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 18:06 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 18:06 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/commons-impact-analytics: apply
* 18:06 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/commons-impact-analytics: apply
* 18:04 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 18:04 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 18:03 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 18:03 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 18:03 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/chart-renderer: apply
* 18:02 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/chart-renderer: apply
* 17:59 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/apertium: apply
* 17:58 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/apertium: apply
* 17:55 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to enable envoy drain on remaining services - [[phab:T364245|T364245]] (duration: 05m 31s)
* 17:52 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to enable envoy drain on remaining services - [[phab:T364245|T364245]]
* 17:35 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:39 rzl@deploy1003: Finished scap sync-world: https://gerrit.wikimedia.org/r/1256396 [[phab:T420666|T420666]] (duration: 11m 21s)
* 16:35 rzl@deploy1003: rzl: Continuing with sync
* 16:34 rzl@deploy1003: rzl: https://gerrit.wikimedia.org/r/1256396 [[phab:T420666|T420666]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:31 rzl@deploy1003: Started scap sync-world: https://gerrit.wikimedia.org/r/1256396 [[phab:T420666|T420666]]
* 16:27 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 16:17 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 16:17 blake@deploy1003: Finished scap sync-world: Test deployment to validate deployment server switchover - [[phab:T413974|T413974]] (duration: 31m 09s)
* 16:16 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 16:05 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 15:47 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1202.eqiad.wmnet onto db1253.eqiad.wmnet
* 15:47 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1253: Pool db1253.eqiad.wmnet in after cloning
* 15:46 blake@deploy1003: Started scap sync-world: Test deployment to validate deployment server switchover - [[phab:T413974|T413974]]
* 15:44 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 15:43 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421330|T421330]]
* 15:43 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421330|T421330]]
* 15:33 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 15:30 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync
* 15:30 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: sync
* 15:23 blake@dns1004: END - running authdns-update
* 15:22 bjensen: updating DNS for the deployment host switchover
* 15:21 blake@dns1004: START - running authdns-update
* 15:19 blake@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet,releases1003.eqiad.wmnet with reason: Deployment server switchover
* 15:03 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:01 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1253: Pool db1253.eqiad.wmnet in after cloning
* 14:39 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:28 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:23 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 14:22 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 14:21 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 14:21 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 14:21 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:20 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 14:20 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 14:20 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 14:19 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 14:18 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 14:18 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 14:17 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 14:17 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 14:17 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 14:17 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 14:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1202: Pool db1202.eqiad.wmnet in after cloning
* 13:57 jynus: dropping ms-backup[12]00[12] grants from backup1-* dbs [[phab:T420464|T420464]]
* 13:56 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1070.eqiad.wmnet
* 13:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1070.eqiad.wmnet
* 13:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1097.eqiad.wmnet
* 13:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1097.eqiad.wmnet
* 13:54 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1055.eqiad.wmnet
* 13:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1055.eqiad.wmnet
* 13:46 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:45 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:40 sergi0: UTC afternoon backport window done
* 13:39 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259132{{!}}GrowthExperiments: scale edit and thanks query limit to more wikis (T341599)]] (duration: 09m 17s)
* 13:35 sgimeno@deploy2002: sgimeno: Continuing with sync
* 13:32 sgimeno@deploy2002: sgimeno: Backport for [[gerrit:1259132{{!}}GrowthExperiments: scale edit and thanks query limit to more wikis (T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1259132{{!}}GrowthExperiments: scale edit and thanks query limit to more wikis (T341599)]]
* 13:26 jforrester@deploy2002: Finished deploy [integration/docroot@f021d3f]: {{Gerrit|Ia936ecd68e675cff2925dba933e3b67b9bad4cd6}} (duration: 00m 11s)
* 13:26 jforrester@deploy2002: Started deploy [integration/docroot@f021d3f]: {{Gerrit|Ia936ecd68e675cff2925dba933e3b67b9bad4cd6}}
* 13:24 kamila@deploy2002: Finished scap sync-world: Backport for [[gerrit:1256384{{!}}Temporarily add shellbox-icu to $wgShellboxUrls (T419049 T419242 T419274)]] (duration: 07m 16s)
* 13:20 kamila@deploy2002: kamila: Continuing with sync
* 13:19 kamila@deploy2002: kamila: Backport for [[gerrit:1256384{{!}}Temporarily add shellbox-icu to $wgShellboxUrls (T419049 T419242 T419274)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:17 kamila@deploy2002: Started scap sync-world: Backport for [[gerrit:1256384{{!}}Temporarily add shellbox-icu to $wgShellboxUrls (T419049 T419242 T419274)]]
* 13:15 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1202: Pool db1202.eqiad.wmnet in after cloning
* 13:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 13:13 kamila@deploy2002: Finished scap sync-world: Backport for [[gerrit:1261420{{!}}cswiki: lift IP cap for editathon (T421305)]] (duration: 07m 22s)
* 13:12 btullis@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 13:09 kamila@deploy2002: kamila, anzx: Continuing with sync
* 13:08 jynus: deploying new grants for new ms-backup hosts and removing old ones [[phab:T420464|T420464]]
* 13:08 kamila@deploy2002: kamila, anzx: Backport for [[gerrit:1261420{{!}}cswiki: lift IP cap for editathon (T421305)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:06 kamila@deploy2002: Started scap sync-world: Backport for [[gerrit:1261420{{!}}cswiki: lift IP cap for editathon (T421305)]]
* 13:03 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:52 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:50 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:43 cdanis: puppet reenabled on drmrs, CIDERGRINDER deployed
* 12:39 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:39 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:23 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:12 cdanis: 💔cdanis@cumin1003.eqiad.wmnet ~ 🕗☕ sudo cumin 'A:cp-drmrs' 'disable-puppet "cdanis CIDER"'
* 12:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1004.eqiad.wmnet
* 12:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1006.eqiad.wmnet
* 12:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1003.eqiad.wmnet
* 12:02 elukey@cumin1003: END (PASS) - Cookbook sre.postgresql.postgres-init (exit_code=0)
* 12:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1002.eqiad.wmnet
* 12:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1005.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1006.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1005.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1004.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1003.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1002.eqiad.wmnet
* 11:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1001.eqiad.wmnet
* 11:44 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:43 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:43 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:42 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:42 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:42 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:41 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:41 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 11:41 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 11:41 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1001.eqiad.wmnet
* 11:38 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1002.eqiad.wmnet
* 11:37 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 11:33 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1002.eqiad.wmnet
* 11:32 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1202: Depool db1202.eqiad.wmnet to then clone it to db1253.eqiad.wmnet - fceratto@cumin1003
* 11:31 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
* 11:31 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1202: Depool db1202.eqiad.wmnet to then clone it to db1253.eqiad.wmnet - fceratto@cumin1003
* 11:31 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db1202.eqiad.wmnet onto db1253.eqiad.wmnet
* 11:31 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 11:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1001.eqiad.wmnet
* 11:22 elukey@cumin1003: START - Cookbook sre.postgresql.postgres-init
* 11:22 elukey@cumin1003: END (FAIL) - Cookbook sre.postgresql.postgres-init (exit_code=99)
* 11:22 elukey@cumin1003: START - Cookbook sre.postgresql.postgres-init
* 11:19 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1001.eqiad.wmnet
* 11:15 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync
* 11:14 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: sync
* 11:14 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 11:13 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 11:07 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 11:04 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260802{{!}}SI: Enable on bnwiki, itwiki, simplewiki, and plwiki (T415529)]] (duration: 09m 23s)
* 10:59 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 10:56 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1260802{{!}}SI: Enable on bnwiki, itwiki, simplewiki, and plwiki (T415529)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:54 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1260802{{!}}SI: Enable on bnwiki, itwiki, simplewiki, and plwiki (T415529)]]
* 10:33 oblivian@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: sync
* 10:32 oblivian@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifeeds: sync
* 10:32 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 10:32 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 10:31 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 10:31 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 10:31 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 10:31 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 10:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:23 tappof@cumin1003: END (PASS) - Cookbook sre.o11y.thanos-compact-restart (exit_code=0)
* 10:23 tappof@cumin1003: START - Cookbook sre.o11y.thanos-compact-restart
* 10:22 tappof@cumin1003: END (PASS) - Cookbook sre.o11y.thanos-compact-restart (exit_code=0)
* 10:22 tappof@cumin1003: START - Cookbook sre.o11y.thanos-compact-restart
* 10:12 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s1
* 10:11 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s1
* 10:05 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s4
* 10:05 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s4
* 09:58 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s8
* 09:58 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s8
* 09:53 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 09:53 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:52 hashar: Starting Gerrit on the replica / gerrit1003
* 09:51 hashar: Stopping Gerrit on the replica / gerrit1003 to clear web sessions
* 09:51 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s7
* 09:50 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s7
* 09:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 09:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 09:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 09:46 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 09:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 09:44 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s3
* 09:43 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s3
* 09:42 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 09:36 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s2
* 09:36 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s2
* 09:31 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 09:29 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s5
* 09:29 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:29 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s5
* 09:26 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 09:22 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s6
* 09:22 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 09:22 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s6
* 09:18 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:16 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section es6
* 09:15 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section es6
* 09:13 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 09:11 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 09:10 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 09:08 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section x3
* 09:07 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section x3
* 09:05 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 09:02 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section x1
* 09:01 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section x1
* 09:01 federico3: starting [[phab:T416708|T416708]] - disabling circular replication on core dbs
* 08:56 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 08:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 08:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 08:43 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:41 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:32 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 08:27 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 08:18 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 08:11 hashar@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.21 refs [[phab:T420479|T420479]]
* 05:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Downgrade clouddb1022 to 10.11.13
== 2026-03-25 ==
* 23:59 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on zuul2001.codfw.wmnet with reason: [[phab:T421330|T421330]]
* 23:58 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on zuul1001.eqiad.wmnet with reason: [[phab:T421330|T421330]]
* 23:29 mutante: zuul1001 - installed mariadb-client - connected once to zuul db on m1-master; mysql> truncate "alembic_version"; - systemctl restart zuul-web - This fixed the zuul-web service; systemctl status finally shows no errors. ([[phab:T405119|T405119]])
* 21:38 ryankemper: [opensearch-k8s] [[phab:T414484|T414484]] Depooled eqiad; change verified working (now when I do `host k8s-ingress-dse-aa.discovery.wmnet` from `cumin1003`, and then reverse-lookup the resulting IP, I get a codfw address); so traffic is now routing to dse-k8s-codfw
* 21:35 ryankemper@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 21:30 Dreamy_Jazz: Created cusi_case, cusi_user, and cusi_signal on bnwiki, itwiki, simplewiki, plwiki for [[phab:T415529|T415529]]
* 21:27 ryankemper: [opensearch-k8s] [[phab:T414484|T414484]] Getting ready to depool `dnsdisc=k8s-ingress-dse-aa,name=eqiad`, leaving codfw pooled. This will get us ready for a full rolling-upgrade of the dse-k8s-eqiad cluster tomorrow.
* 21:23 Dreamy_Jazz: Evening UTC backport window done
* 21:08 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260797{{!}}SuggestedInvestigations: Import session into signal matching job (T421062)]] (duration: 10m 26s)
* 21:04 kharlan@deploy2002: kharlan: Continuing with sync
* 21:01 kharlan@deploy2002: kharlan: Backport for [[gerrit:1260797{{!}}SuggestedInvestigations: Import session into signal matching job (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:58 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1260797{{!}}SuggestedInvestigations: Import session into signal matching job (T421062)]]
* 20:51 eevans@cumin1003: END (ERROR) - Cookbook sre.cassandra.roll-reboot (exit_code=97) rolling reboot on P<nowiki>{</nowiki>sessionstore[1004-1006].eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 20:43 aaron@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259183{{!}}Add Analytics APIs to the RestSandbox (T419429)]] (duration: 08m 33s)
* 20:38 aaron@deploy2002: aaron: Continuing with sync
* 20:36 aaron@deploy2002: aaron: Backport for [[gerrit:1259183{{!}}Add Analytics APIs to the RestSandbox (T419429)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:34 aaron@deploy2002: Started scap sync-world: Backport for [[gerrit:1259183{{!}}Add Analytics APIs to the RestSandbox (T419429)]]
* 20:30 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247073{{!}}Deploy temporary accounts to ruwiki (T413771)]] (duration: 11m 04s)
* 20:25 jdlrobson@deploy2002: stran, jdlrobson: Continuing with sync
* 20:21 jdlrobson@deploy2002: stran, jdlrobson: Backport for [[gerrit:1247073{{!}}Deploy temporary accounts to ruwiki (T413771)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:19 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1247073{{!}}Deploy temporary accounts to ruwiki (T413771)]]
* 20:17 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255775{{!}}Close the legacy-vector dblist (T421289)]] (duration: 07m 42s)
* 20:14 aokoth@cumin1003: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:14 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:13 aokoth@cumin1003: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:13 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:12 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 20:12 aokoth@cumin1003: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:12 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1255775{{!}}Close the legacy-vector dblist (T421289)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1255775{{!}}Close the legacy-vector dblist (T421289)]]
* 20:05 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy (exit_code=0) rolling reboot on P<nowiki>{</nowiki>hcaptcha-proxy7002.wikimedia.org<nowiki>}</nowiki> and A:hcaptcha-proxy
* 20:01 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy rolling reboot on P<nowiki>{</nowiki>hcaptcha-proxy7002.wikimedia.org<nowiki>}</nowiki> and A:hcaptcha-proxy
* 20:00 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>sessionstore[1004-1006].eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 19:34 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pybal-test2003.codfw.wmnet
* 19:30 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host pybal-test2003.codfw.wmnet
* 19:26 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 19:24 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>sessionstore[2004-2006].codfw.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 19:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2011.codfw.wmnet
* 19:17 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2011.codfw.wmnet
* 19:17 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 19:14 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Planned reboot
* 19:11 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 19:11 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 19:07 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 19:00 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2012.codfw.wmnet
* 18:57 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2012.codfw.wmnet
* 18:53 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>sessionstore[2004-2006].codfw.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 18:51 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
* 18:51 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
* 18:50 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
* 18:50 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
* 18:49 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
* 18:49 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
* 18:49 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
* 18:48 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/toolhub: apply
* 18:48 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/termbox: apply
* 18:47 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/termbox: apply
* 18:47 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 18:47 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 18:47 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 18:46 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 18:46 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:46 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:46 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 18:46 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2012.codfw.wmnet with reason: Planned reboot
* 18:46 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 18:46 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 18:45 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 18:45 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 18:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 18:44 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
* 18:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/sessionstore: apply
* 18:44 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/recommendation-api: apply
* 18:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/recommendation-api: apply
* 18:43 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
* 18:43 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
* 18:43 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/push-notifications: apply
* 18:42 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/push-notifications: apply
* 18:42 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 18:41 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/proton: apply
* 18:41 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 18:40 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
* 18:40 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/page-analytics: apply
* 18:39 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 18:39 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
* 18:37 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 18:37 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 18:37 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 18:37 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 18:35 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 18:34 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:34 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:33 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
* 18:29 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 18:28 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
* 18:28 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* 18:26 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 18:26 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
* 18:26 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
* 18:25 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: debug java install
* 18:25 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 18:25 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases1003.eqiad.wmnet with reason: debug java install
* 18:25 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/image-suggestion: apply
* 18:24 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/image-suggestion: apply
* 18:23 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
* 18:23 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
* 18:23 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
* 18:22 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
* 18:22 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 18:22 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 18:21 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
* 18:21 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
* 18:20 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
* 18:20 mutante: releases1003 - apt-get upgrade - envoyproxy, python3-wmflib
* 18:20 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
* 18:20 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:19 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 18:19 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 18:18 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 18:18 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
* 18:18 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
* 18:17 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
* 18:17 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
* 18:16 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 18:16 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/echostore: apply
* 18:15 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
* 18:15 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
* 18:15 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 18:14 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 18:14 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/data-gateway: apply
* 18:14 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/data-gateway: apply
* 18:14 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 18:13 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 18:13 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/commons-impact-analytics: apply
* 18:13 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/commons-impact-analytics: apply
* 18:12 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 18:12 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
* 18:11 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 18:11 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
* 18:11 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/chart-renderer: apply
* 18:11 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/chart-renderer: apply
* 18:09 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/apertium: apply
* 18:09 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/apertium: apply
* 17:29 sfaci@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 17:29 sfaci@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 17:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 17:23 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases1003.eqiad.wmnet with reason: debug java install
* 17:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 16:44 ebysans@deploy2002: Finished deploy [analytics/refinery@80c527b] (thin): Regular analytics weekly train THIN [analytics/refinery@80c527b6] (duration: 01m 59s)
* 16:42 ebysans@deploy2002: Started deploy [analytics/refinery@80c527b] (thin): Regular analytics weekly train THIN [analytics/refinery@80c527b6]
* 16:42 SandraEbele_: Deploying Refinery as part of weekly deployment train
* 16:41 ebysans@deploy2002: Finished deploy [analytics/refinery@80c527b]: Regular analytics weekly train [analytics/refinery@80c527b6] (duration: 04m 32s)
* 16:37 ebysans@deploy2002: Started deploy [analytics/refinery@80c527b]: Regular analytics weekly train [analytics/refinery@80c527b6]
* 16:22 ebysans@deploy2002: Finished deploy [analytics/refinery@80c527b] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@80c527b6] (duration: 01m 58s)
* 16:22 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 16:21 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:21 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:20 ebysans@deploy2002: Started deploy [analytics/refinery@80c527b] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@80c527b6]
* 16:20 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 16:19 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 16:18 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:18 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:06 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 16:05 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 16:05 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 16:04 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 16:03 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:02 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 16:02 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 16:01 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:51 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:50 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:42 blake@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244628{{!}}debug: reorder debug backends for eqiad switchover (T413974)]] (duration: 07m 41s)
* 15:41 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:37 blake@deploy2002: blake: Continuing with sync
* 15:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:37 blake@deploy2002: blake: Backport for [[gerrit:1244628{{!}}debug: reorder debug backends for eqiad switchover (T413974)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:34 blake@deploy2002: Started scap sync-world: Backport for [[gerrit:1244628{{!}}debug: reorder debug backends for eqiad switchover (T413974)]]
* 15:34 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:32 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-unlock-scap (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:32 root@deploy2002: Unlocked for deployment [ALL REPOSITORIES]: Datacenter switchover from codfw to eqiad - (duration: 91m 45s)
* 15:32 root@deploy2002: Forcefully removing global lock: Datacenter switchover from codfw to eqiad -
* 15:32 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-unlock-scap for datacenter switchover from codfw to eqiad
* 15:31 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:26 blake@dns1004: END - running authdns-update
* 15:24 blake@dns1004: START - running authdns-update
* 15:24 elukey@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 15:23 elukey@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 15:18 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters for datacenter switchover from codfw to eqiad
* 15:18 blake@dns1004: END - running authdns-update
* 15:16 blake@dns1004: START - running authdns-update
* 15:14 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-restore-ttl (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:13 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-restore-ttl for datacenter switchover from codfw to eqiad
* 15:11 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:10 root@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 15:09 root@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 15:08 root@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 15:07 root@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 15:07 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.08-start-maintenance for datacenter switchover from codfw to eqiad
* 15:07 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:07 root@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: sync
* 15:07 root@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: sync
* 15:07 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner for datacenter switchover from codfw to eqiad
* 15:02 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:02 blake@cumin1003: MediaWiki read-only period ends at: 2026-03-25 15:02:52.921926
* 14:55 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0) for datacenter switchover from codfw to eqiad
* 14:53 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance for datacenter switchover from codfw to eqiad
* 14:52 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks (exit_code=0) for datacenter switchover from codfw to eqiad
* 14:52 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks for datacenter switchover from codfw to eqiad
* 14:51 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0) for datacenter switchover from codfw to eqiad
* 14:46 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl for datacenter switchover from codfw to eqiad
* 14:28 cdanis: 💙cdanis@apt1002.wikimedia.org ~ 🕥☕ sudo -i reprepro --component main --restrict cidergrinder update trixie-wikimedia
* 14:28 cdanis: 💙cdanis@apt1002.wikimedia.org ~ 🕥☕ sudo -i reprepro --component main --restrict cidergrinder update bullseye-wikimedia
* 14:24 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['phab2002']
* 14:24 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:17 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['phab2002']
* 14:14 gengh@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:13 gengh@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:13 gengh@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:12 gengh@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 gengh@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:11 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:11 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:11 gengh@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:10 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:10 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:08 gengh@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:07 gengh@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 gengh@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:07 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:06 gengh@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:06 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:06 gengh@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:05 gengh@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-lock-scap (exit_code=0) for datacenter switchover from codfw to eqiad
* 14:00 root@deploy2002: Locking from deployment [ALL REPOSITORIES]: Datacenter switchover from codfw to eqiad -
* 14:00 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-lock-scap for datacenter switchover from codfw to eqiad
* 13:49 otto@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260091{{!}}EventStreamConfig - Increase spark_job_ingestion_scale for larger event streams (T360794 T351225)]] (duration: 07m 48s)
* 13:45 otto@deploy2002: otto: Continuing with sync
* 13:45 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:44 otto@deploy2002: otto: Backport for [[gerrit:1260091{{!}}EventStreamConfig - Increase spark_job_ingestion_scale for larger event streams (T360794 T351225)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:42 otto@deploy2002: Started scap sync-world: Backport for [[gerrit:1260091{{!}}EventStreamConfig - Increase spark_job_ingestion_scale for larger event streams (T360794 T351225)]]
* 13:32 awight@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260614{{!}}[beta] Kill synthetic refs with feature flag (T421055)]], [[gerrit:1251193{{!}}idwiki: Remove unused user groups on Indonesian Wikipedia (T419105)]], [[gerrit:1251200{{!}}ptwiki: Enable block action for the abuse filter (T419312)]], [[gerrit:1256748{{!}}ptwiki: Add suppressredirect to autoreviewer and rollbacker user groups (T420704)]] (duration: 11m 33s)
* 13:29 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:29 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:27 awight@deploy2002: codenamenoreste, awight, gerrit-patch-uploader: Continuing with sync
* 13:23 awight@deploy2002: codenamenoreste, awight, gerrit-patch-uploader: Backport for [[gerrit:1260614{{!}}[beta] Kill synthetic refs with feature flag (T421055)]], [[gerrit:1251193{{!}}idwiki: Remove unused user groups on Indonesian Wikipedia (T419105)]], [[gerrit:1251200{{!}}ptwiki: Enable block action for the abuse filter (T419312)]], [[gerrit:1256748{{!}}ptwiki: Add suppressredirect to autoreviewer and rollbacker user groups (T420704)]]
* 13:20 awight@deploy2002: Started scap sync-world: Backport for [[gerrit:1260614{{!}}[beta] Kill synthetic refs with feature flag (T421055)]], [[gerrit:1251193{{!}}idwiki: Remove unused user groups on Indonesian Wikipedia (T419105)]], [[gerrit:1251200{{!}}ptwiki: Enable block action for the abuse filter (T419312)]], [[gerrit:1256748{{!}}ptwiki: Add suppressredirect to autoreviewer and rollbacker user groups (T420704)]]
* 13:17 dcausse@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260045{{!}}Revert^2 "search: use the discovery ns record for the semanticsearch cluster"]] (duration: 10m 20s)
* 13:12 dcausse@deploy2002: dcausse: Continuing with sync
* 13:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:09 dcausse@deploy2002: dcausse: Backport for [[gerrit:1260045{{!}}Revert^2 "search: use the discovery ns record for the semanticsearch cluster"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:07 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:06 dcausse@deploy2002: Started scap sync-world: Backport for [[gerrit:1260045{{!}}Revert^2 "search: use the discovery ns record for the semanticsearch cluster"]]
* 13:06 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:02 XioNoX: Inter.Link - DDoS - Activation of automatic reroute
* 12:56 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:51 marostegui@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 12:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Downgrade clouddb1022 to 10.11.15
* 12:41 marostegui@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 12:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-mariadb1002.eqiad.wmnet
* 12:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-test-coord1002.eqiad.wmnet
* 12:38 mszwarc@deploy2002: mwscript-k8s job started: foreachwikiindblist all demoteIneligibleUsers.php --relay-log checkuser=metawiki --relay-log suppress=metawiki # [[phab:T418580|T418580]]
* 12:34 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-test-coord1002.eqiad.wmnet
* 12:34 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-mariadb1002.eqiad.wmnet
* 12:33 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:32 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1028.eqiad.wmnet
* 12:25 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host wdqs1028.eqiad.wmnet
* 12:24 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:19 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260617{{!}}Allow for demoting 2FA-less members of further 6 groups (T418580)]] (duration: 10m 23s)
* 12:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2009.codfw.wmnet
* 12:12 mszwarc@deploy2002: mszwarc: Continuing with sync
* 12:11 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1260617{{!}}Allow for demoting 2FA-less members of further 6 groups (T418580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:09 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1260617{{!}}Allow for demoting 2FA-less members of further 6 groups (T418580)]]
* 12:07 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host wdqs2009.codfw.wmnet
* 12:07 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl2002.codfw.wmnet
* 12:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl2002.codfw.wmnet
* 11:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl2001.codfw.wmnet
* 11:53 marostegui: Restart clouddb1022:s3 to enable error_log [[phab:T420177|T420177]]
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl2001.codfw.wmnet
* 11:51 jayme: migrated wikikube apiservers (eqiad and codfw) to IPIP - [[phab:T420436|T420436]]
* 11:49 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: wikikube-master-codfw@codfw
* 11:49 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 11:48 jayme@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 11:46 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: wikikube-master-eqiad@eqiad
* 11:46 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 11:45 jayme@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 11:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1002.eqiad.wmnet
* 11:43 jayme@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: wikikube-master-codfw@codfw
* 11:41 jayme@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: wikikube-master-eqiad@eqiad
* 11:40 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1002.eqiad.wmnet
* 11:38 mvernon@deploy2002: helmfile [codfw] DONE helmfile.d/services/kartotherian: apply
* 11:36 mvernon@deploy2002: helmfile [codfw] START helmfile.d/services/kartotherian: apply
* 11:21 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:18 mvernon@deploy2002: helmfile [codfw] DONE helmfile.d/services/kartotherian: apply
* 11:16 mvernon@deploy2002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
* 11:15 mvernon@deploy2002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
* 11:15 mvernon@deploy2002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
* 11:14 mvernon@deploy2002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
* 11:07 mvernon@deploy2002: helmfile [codfw] START helmfile.d/services/kartotherian: apply
* 11:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis abstractwiki in section s5
* 11:07 mvernon@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: apply
* 11:05 mvernon@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: apply
* 10:55 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:53 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis abstractwiki in section s5
* 10:45 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 10:33 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:27 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:26 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:21 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:20 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:20 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:01 elukey@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=kartotherian,name=codfw
* 09:58 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:57 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:52 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:52 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:51 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:51 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:46 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:45 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:45 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:44 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:05 elukey@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=aux-k8s-worker200[2-5].codfw.wmnet,cluster=aux-k8s,service=kubesvc
* 09:04 elukey@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=aux-k8s-worker200[6-9].codfw.wmnet,cluster=aux-k8s,service=kubesvc
* 09:04 elukey@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=aux-k8s-worker100[6-9].eqiad.wmnet,cluster=aux-k8s,service=kubesvc
* 08:55 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker200[6-9].eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:55 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker200[6-9].eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:35 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker1009.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:35 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker1008.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker1007.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker1006.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker1009.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker1008.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker1007.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker1006.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device fasw2-c8b-codfw
* 08:29 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device fasw2-c8b-codfw
* 08:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device fasw2-c8a-codfw
* 08:29 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device fasw2-c8a-codfw
* 08:10 hashar@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.21 refs [[phab:T420479|T420479]]
* 00:33 rzl@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 00:23 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
* 00:23 rzl@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
* 00:23 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 00:22 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 00:22 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
* 00:22 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
* 00:21 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
* 00:21 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
* 00:21 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 00:20 rzl@deploy2002: helmfile [staging] START helmfile.d/services/toolhub: apply
* 00:20 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/termbox: apply
* 00:20 rzl@deploy2002: helmfile [staging] START helmfile.d/services/termbox: apply
* 00:20 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
* 00:20 rzl@deploy2002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
* 00:19 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 00:19 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 00:18 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 00:18 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 00:18 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 00:18 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 00:17 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 00:17 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 00:17 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 00:17 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 00:17 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 00:16 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
* 00:16 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 00:16 rzl@deploy2002: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 00:16 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/recommendation-api: apply
* 00:15 rzl@deploy2002: helmfile [staging] START helmfile.d/services/recommendation-api: apply
* 00:15 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
* 00:15 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
* 00:15 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1023.eqiad.wmnet with OS bookworm
* 00:15 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/push-notifications: apply
* 00:15 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:15 rzl@deploy2002: helmfile [staging] START helmfile.d/services/push-notifications: apply
* 00:15 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/proton: apply
* 00:14 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:14 rzl@deploy2002: helmfile [staging] START helmfile.d/services/proton: apply
* 00:13 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
* 00:13 rzl@deploy2002: helmfile [staging] START helmfile.d/services/page-analytics: apply
* 00:13 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 00:13 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
* 00:13 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 00:12 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 00:12 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 00:11 rzl@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
* 00:10 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 00:10 rzl@deploy2002: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 00:09 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
* 00:07 rzl@deploy2002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
* 00:07 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 00:06 rzl@deploy2002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* 00:06 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
* 00:06 rzl@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: apply
* 00:06 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
* 00:06 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1022.eqiad.wmnet with OS bookworm
* 00:06 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:06 rzl@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
* 00:06 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/image-suggestion: apply
* 00:06 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:05 rzl@deploy2002: helmfile [staging] START helmfile.d/services/image-suggestion: apply
* 00:05 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
* 00:05 rzl@deploy2002: helmfile [staging] START helmfile.d/services/geo-analytics: apply
* 00:05 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 00:05 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 00:04 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 00:04 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 00:04 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
* 00:04 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1021.eqiad.wmnet with OS bookworm
* 00:04 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:04 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
* 00:04 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 00:04 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:03 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 00:03 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 00:03 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 00:02 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
* 00:02 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
* 00:02 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
* 00:01 rzl@deploy2002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
* 00:01 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
* 00:01 rzl@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
* 00:00 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 00:00 rzl@deploy2002: helmfile [staging] START helmfile.d/services/echostore: apply
* 00:00 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
* 00:00 rzl@deploy2002: helmfile [staging] START helmfile.d/services/device-analytics: apply
== 2026-03-24 ==
* 23:59 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 23:59 rzl@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 23:59 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
* 23:59 rzl@deploy2002: helmfile [staging] START helmfile.d/services/data-gateway: apply
* 23:58 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 23:58 rzl@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 23:58 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/commons-impact-analytics: apply
* 23:58 rzl@deploy2002: helmfile [staging] START helmfile.d/services/commons-impact-analytics: apply
* 23:56 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 23:56 rzl@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
* 23:56 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 23:56 rzl@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
* 23:54 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/chart-renderer: apply
* 23:53 rzl@deploy2002: helmfile [staging] START helmfile.d/services/chart-renderer: apply
* 23:53 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1023.eqiad.wmnet with reason: host reimage
* 23:53 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/apertium: apply
* 23:52 rzl@deploy2002: helmfile [staging] START helmfile.d/services/apertium: apply
* 23:46 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1022.eqiad.wmnet with reason: host reimage
* 23:43 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1021.eqiad.wmnet with reason: host reimage
* 23:19 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1023.eqiad.wmnet with OS bookworm
* 23:19 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1022.eqiad.wmnet with OS bookworm
* 23:18 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1021.eqiad.wmnet with OS bookworm
* 23:16 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
* 23:16 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
* 23:15 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
* 23:15 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
* 22:03 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260118{{!}}Drop inactive simple summary surveys (T389393)]] (duration: 08m 15s)
* 21:57 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 21:57 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1260118{{!}}Drop inactive simple summary surveys (T389393)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1260118{{!}}Drop inactive simple summary surveys (T389393)]]
* 21:52 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259147{{!}}Address FIXME and drop not selector for section headings (T420085)]] (duration: 13m 11s)
* 21:47 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 21:44 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1259147{{!}}Address FIXME and drop not selector for section headings (T420085)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:38 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1259147{{!}}Address FIXME and drop not selector for section headings (T420085)]]
* 21:00 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=frwiki --source-pseudo-namespace=Abstract_ --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:55 jforrester@deploy2002: mwscript-k8s job started: moveBatch --wiki=frwiki '--u=Jdforrester (WMF)' --r=[[phab:T420654|T420654]] --noredirects /home/jforrester/T420654-frwiki-move # [[phab:T420654|T420654]] abstract: is now an interwiki; manual fix
* 20:55 jforrester@deploy2002: mwscript-k8s job started: moveBatch '--u=Jdforrester (WMF)' --r=[[phab:T420654|T420654]] --noredirects /home/jforrester/T420654-frwiki-move # [[phab:T420654|T420654]] abstract: is now an interwiki; manual fix
* 20:47 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=ptwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:46 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=idwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:46 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=frwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:45 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=eswiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:39 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=enwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:39 jforrester@deploy2002: mwscript-k8s job started: sql extensions/WikimediaMaintenance/maintenance/namespaceDupes.php --wiki=enwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:38 jforrester@deploy2002: mwscript-k8s job started: sql maintenance/namespaceDupes.php --wiki=enwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:38 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1256433{{!}}[wikifunctions] Drop m.wikifunctions.org from lists, we've not used it for years]], [[gerrit:1250114{{!}}Move GrowthExperiments REST API definition to IS]], [[gerrit:1259993{{!}}dumpInterwiki: Re-generate to add Abstract Wikipedia (and others) (T420654)]] (duration: 07m 46s)
* 20:33 jforrester@deploy2002: jforrester: Continuing with sync
* 20:32 jforrester@deploy2002: jforrester: Backport for [[gerrit:1256433{{!}}[wikifunctions] Drop m.wikifunctions.org from lists, we've not used it for years]], [[gerrit:1250114{{!}}Move GrowthExperiments REST API definition to IS]], [[gerrit:1259993{{!}}dumpInterwiki: Re-generate to add Abstract Wikipedia (and others) (T420654)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified the
* 20:30 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1256433{{!}}[wikifunctions] Drop m.wikifunctions.org from lists, we've not used it for years]], [[gerrit:1250114{{!}}Move GrowthExperiments REST API definition to IS]], [[gerrit:1259993{{!}}dumpInterwiki: Re-generate to add Abstract Wikipedia (and others) (T420654)]]
* 20:27 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259967{{!}}Set json object before setting Abstract Wiki Id (T420916)]], [[gerrit:1259994{{!}}AbstractPreview: apply selected preview language lang/dir to abstract preview body (T420687)]], [[gerrit:1260092{{!}}AbstractTitle: Handle pageinfo responses without normalized titles (T420725)]], [[gerrit:1259992{{!}}[abstractwiki] Don't list abstract as a langlist entry
* 20:22 jforrester@deploy2002: jforrester: Continuing with sync
* 20:22 jforrester@deploy2002: jforrester: Backport for [[gerrit:1259967{{!}}Set json object before setting Abstract Wiki Id (T420916)]], [[gerrit:1259994{{!}}AbstractPreview: apply selected preview language lang/dir to abstract preview body (T420687)]], [[gerrit:1260092{{!}}AbstractTitle: Handle pageinfo responses without normalized titles (T420725)]], [[gerrit:1259992{{!}}[abstractwiki] Don't list abstract as a langlist entry (T420654)]] s
* 20:20 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1259967{{!}}Set json object before setting Abstract Wiki Id (T420916)]], [[gerrit:1259994{{!}}AbstractPreview: apply selected preview language lang/dir to abstract preview body (T420687)]], [[gerrit:1260092{{!}}AbstractTitle: Handle pageinfo responses without normalized titles (T420725)]], [[gerrit:1259992{{!}}[abstractwiki] Don't list abstract as a langlist entry
* 20:12 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1242542{{!}}Generate our own logo thumbnails rather than using MediaWiki's (T414048)]], [[gerrit:1250095{{!}}Enwikinews: Only enable flaggedRevs in article namespace (T418066)]], [[gerrit:1252684{{!}}Disable magic links on afwiki (T420142)]] (duration: 09m 22s)
* 20:08 jforrester@deploy2002: jforrester, pppery: Continuing with sync
* 20:05 jforrester@deploy2002: jforrester, pppery: Backport for [[gerrit:1242542{{!}}Generate our own logo thumbnails rather than using MediaWiki's (T414048)]], [[gerrit:1250095{{!}}Enwikinews: Only enable flaggedRevs in article namespace (T418066)]], [[gerrit:1252684{{!}}Disable magic links on afwiki (T420142)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:03 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1242542{{!}}Generate our own logo thumbnails rather than using MediaWiki's (T414048)]], [[gerrit:1250095{{!}}Enwikinews: Only enable flaggedRevs in article namespace (T418066)]], [[gerrit:1252684{{!}}Disable magic links on afwiki (T420142)]]
* 19:42 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 19:42 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 19:41 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 19:39 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1258300{{!}}tests: Make many things static for PHPUnit 10 (T420844)]], [[gerrit:1258301{{!}}phpunit.xml: Update configuration for PHPUnit 10 (T420844)]] (duration: 07m 21s)
* 19:35 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 19:35 reedy@deploy2002: reedy: Continuing with sync
* 19:34 reedy@deploy2002: reedy: Backport for [[gerrit:1258300{{!}}tests: Make many things static for PHPUnit 10 (T420844)]], [[gerrit:1258301{{!}}phpunit.xml: Update configuration for PHPUnit 10 (T420844)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:32 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1258300{{!}}tests: Make many things static for PHPUnit 10 (T420844)]], [[gerrit:1258301{{!}}phpunit.xml: Update configuration for PHPUnit 10 (T420844)]]
* 19:02 inflatador: bking@apt1002 `sudo -E reprepro -C component/opensearch2 include trixie-wikimedia ~/wmf-opensearch-search-plugins-2.19.5+3-trixie/wmf-opensearch-search-plugins_2.19.5+3_amd64.changes`
* 18:48 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 18:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1170: Degraded drive replaced [[phab:T420873|T420873]]
* 18:43 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 18:36 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 18:35 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 18:25 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:24 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:20 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 18:20 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 18:13 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 18:11 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 18:07 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on phab2002.codfw.wmnet with reason: [[phab:T420228|T420228]]
* 18:01 aokoth@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 18:01 aokoth@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 18:01 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:00 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:00 mutante: codesearch9.codesearch - systemctl restart hound_proxy ([[phab:T421147|T421147]])
* 17:34 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 17:30 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 17:20 blake@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 17:20 blake@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 17:20 blake@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 17:20 blake@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 17:00 blake@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:00 blake@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:00 blake@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:00 blake@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 16:47 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 16:38 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1113.*
* 16:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1170: Degraded drive replaced [[phab:T420873|T420873]]
* 16:24 sfaci@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:24 sfaci@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:22 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1113.eqiad.wmnet with OS trixie
* 16:05 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 16:04 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 16:03 blake@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 16:03 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 16:03 blake@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 16:03 blake@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 16:03 blake@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 16:03 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1113.eqiad.wmnet with reason: host reimage
* 15:54 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1113.eqiad.wmnet with reason: host reimage
* 15:54 bjensen: Services portion of the datacenter switchover is complete
* 15:51 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker2009.codfw.wmnet with OS trixie
* 15:46 blake@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) depool all services in codfw: Datacenter Switchover - [[phab:T413974|T413974]]
* 15:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 15:38 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1113.eqiad.wmnet with OS trixie
* 15:38 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1113.eqiad.wmnet with OS trixie
* 15:36 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 15:34 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker2009.codfw.wmnet with reason: host reimage
* 15:30 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker2009.codfw.wmnet with reason: host reimage
* 15:20 blake@cumin1003: START - Cookbook sre.discovery.datacenter depool all services in codfw: Datacenter Switchover - [[phab:T413974|T413974]]
* 15:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1113.eqiad.wmnet with OS trixie
* 15:18 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker2009.codfw.wmnet with OS trixie
* 15:16 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker2008.codfw.wmnet with OS trixie
* 14:59 blake@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool codfw [reason: no reason specified, no task ID specified]
* 14:59 blake@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool codfw [reason: no reason specified, no task ID specified]
* 14:59 bjensen: beginning the Traffic and Services portions of the DC switchover, operational followup will be in #wikimedia-sre
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker2008.codfw.wmnet with reason: host reimage
* 14:56 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker2008.codfw.wmnet with reason: host reimage
* 14:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:50 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1009.eqiad.wmnet with OS trixie
* 14:44 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker2008.codfw.wmnet with OS trixie
* 14:42 aokoth@dns1004: END - running authdns-update
* 14:41 aokoth@dns1004: START - running authdns-update
* 14:34 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1009.eqiad.wmnet with reason: host reimage
* 14:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2009.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:27 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1009.eqiad.wmnet with reason: host reimage
* 14:26 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker2009.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:25 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:23 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:20 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2008.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:19 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:19 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:16 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1009.eqiad.wmnet with OS trixie
* 14:15 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker2008.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:14 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
* 14:13 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 14:13 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 14:13 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 14:12 dcausse@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259979{{!}}Revert "search: use the discovery ns record for the semanticsearch cluster"]] (duration: 06m 54s)
* 14:08 dcausse@deploy2002: dcausse: Continuing with sync
* 14:07 dcausse@deploy2002: dcausse: Backport for [[gerrit:1259979{{!}}Revert "search: use the discovery ns record for the semanticsearch cluster"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2007.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:05 dcausse@deploy2002: Started scap sync-world: Backport for [[gerrit:1259979{{!}}Revert "search: use the discovery ns record for the semanticsearch cluster"]]
* 14:04 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1008.eqiad.wmnet with OS trixie
* 14:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker2007.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:59 jforrester@deploy2002: mwscript-k8s job started: sql --wiki=abstractwiki /srv/mediawiki/php-1.46.0-wmf.20/extensions/Translate/sql/mysql/translate_message_group_subscriptions.sql # [[phab:T420656|T420656]] translate_message_group_subscriptions
* 13:59 dcausse@deploy2002: Sync cancelled.
* 13:57 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:52 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1008.eqiad.wmnet with reason: host reimage
* 13:46 dcausse@deploy2002: dcausse: Backport for [[gerrit:1259875{{!}}search: use the discovery ns record for the semanticsearch cluster (T414484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:44 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1008.eqiad.wmnet with reason: host reimage
* 13:44 dcausse@deploy2002: Started scap sync-world: Backport for [[gerrit:1259875{{!}}search: use the discovery ns record for the semanticsearch cluster (T414484)]]
* 13:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1008.eqiad.wmnet with OS trixie
* 13:32 sukhe: sudo cumin -b1 -s20 'C:bird' "run-puppet-agent --enable 'merging CR {{Gerrit|1248385}}, [[phab:T413740|T413740]]'"
* 13:30 cmelo@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259231{{!}}Enable the CampaignEvents extension on all wikibooks (T419597)]], [[gerrit:1259237{{!}}Enable $wgCampaignEventsEnableEventGoals in prod wikis (T414149)]] (duration: 12m 43s)
* 13:26 cmelo@deploy2002: cmelo, daimona: Continuing with sync
* 13:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1007.eqiad.wmnet with OS trixie
* 13:23 sukhe: sudo cumin 'C:bird' "disable-puppet 'merging CR {{Gerrit|1248385}}, [[phab:T413740|T413740]]'"
* 13:20 cmelo@deploy2002: cmelo, daimona: Backport for [[gerrit:1259231{{!}}Enable the CampaignEvents extension on all wikibooks (T419597)]], [[gerrit:1259237{{!}}Enable $wgCampaignEventsEnableEventGoals in prod wikis (T414149)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:18 cmelo@deploy2002: Started scap sync-world: Backport for [[gerrit:1259231{{!}}Enable the CampaignEvents extension on all wikibooks (T419597)]], [[gerrit:1259237{{!}}Enable $wgCampaignEventsEnableEventGoals in prod wikis (T414149)]]
* 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1007.eqiad.wmnet with reason: host reimage
* 13:04 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) payments1012.frack.eqiad.wmnet on all recursors
* 13:04 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache payments1012.frack.eqiad.wmnet on all recursors
* 13:04 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) payments1011.frack.eqiad.wmnet on all recursors
* 13:03 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache payments1011.frack.eqiad.wmnet on all recursors
* 13:03 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) payments1010.frack.eqiad.wmnet on all recursors
* 13:03 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache payments1010.frack.eqiad.wmnet on all recursors
* 13:02 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1007.eqiad.wmnet with reason: host reimage
* 13:00 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:00 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify records for payments servers frack - cmooney@cumin1003"
* 13:00 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify records for payments servers frack - cmooney@cumin1003"
* 12:56 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 12:50 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1007.eqiad.wmnet with OS trixie
* 12:02 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1017.eqiad.wmnet
* 12:02 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1017.eqiad.wmnet
* 12:01 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s1
* 11:53 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:53 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:51 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1017.eqiad.wmnet with reason: Rebooting clouddb1017 [[phab:T419960|T419960]]
* 11:51 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1017.eqiad.wmnet
* 11:51 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1017.eqiad.wmnet
* 11:51 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s1
* 11:49 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:49 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:48 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:48 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:46 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:46 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:40 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1006.eqiad.wmnet with OS trixie
* 11:36 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin1001.eqiad.wmnet
* 11:32 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet,service=x3
* 11:32 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1016.eqiad.wmnet,service=x3
* 11:32 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcumin1001.eqiad.wmnet
* 11:31 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1023.eqiad.wmnet,service=x3
* 11:31 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=x3
* 11:31 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1023.eqiad.wmnet,service=x3
* 11:31 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1022.eqiad.wmnet,service=x3
* 11:27 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s3
* 11:27 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1023.eqiad.wmnet,service=s3
* 11:27 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1023.eqiad.wmnet,service=s3
* 11:26 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1006.eqiad.wmnet with reason: host reimage
* 11:22 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:19 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:19 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1023.eqiad.wmnet,service=s3
* 11:18 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1006.eqiad.wmnet with reason: host reimage
* 11:18 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s3
* 11:17 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s3
* 11:14 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1023.eqiad.wmnet,service=s3
* 11:14 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:07 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1006.eqiad.wmnet with OS trixie
* 10:55 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:55 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:53 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker2007.codfw.wmnet with OS trixie
* 10:49 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker2006.codfw.wmnet with OS trixie
* 10:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:36 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker2007.codfw.wmnet with reason: host reimage
* 10:33 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker2006.codfw.wmnet with reason: host reimage
* 10:30 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:29 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:29 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker2007.codfw.wmnet with reason: host reimage
* 10:28 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker2006.codfw.wmnet with reason: host reimage
* 10:22 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:21 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:20 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:20 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:17 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host aux-k8s-worker1006.eqiad.wmnet with OS trixie
* 10:17 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker2007.codfw.wmnet with OS trixie
* 10:16 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker2006.codfw.wmnet with OS trixie
* 10:07 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1006.eqiad.wmnet with OS trixie
* 09:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:46 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:43 ayounsi@cumin1003: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:34 ayounsi@cumin1003: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:31 ayounsi@cumin1003: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:31 ayounsi@cumin1003: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:29 ayounsi@cumin1003: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:29 ayounsi@cumin1003: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:23 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4008.ulsfo.wmnet with OS bookworm
* 09:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 09:01 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 08:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 08:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old ulsfo ganeti VIP - ayounsi@cumin1003"
* 08:50 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old ulsfo ganeti VIP - ayounsi@cumin1003"
* 08:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:46 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1170: Degraded drive [[phab:T420873|T420873]]
* 08:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:45 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1170: Degraded drive [[phab:T420873|T420873]]
* 08:43 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:39 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host ganeti4008.ulsfo.wmnet with OS bookworm
* 08:31 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 08:31 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:29 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:27 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:27 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:25 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:24 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:13 hashar@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.21 refs [[phab:T420479|T420479]]
* 07:59 hashar: Changed https://logstash.wikimedia.org/ default page back to /app/dashboards
* 04:01 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.18 (duration: 01m 13s)
* 03:42 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.21 refs [[phab:T420479|T420479]] (duration: 39m 27s)
* 03:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.21 refs [[phab:T420479|T420479]]
* 02:46 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 04s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:56 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 01:40 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1104.*
* 01:37 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1104.eqiad.wmnet with OS trixie
* 01:14 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1104.eqiad.wmnet with reason: host reimage
* 01:08 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1104.eqiad.wmnet with reason: host reimage
* 00:52 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1104.eqiad.wmnet with OS trixie
* 00:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 00:18 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1115.eqiad.wmnet with OS trixie
== 2026-03-23 ==
* 22:51 rzl: root@apt1002:~# reprepro --noskipold --restrict vopsbot update bookworm-wikimedia
* 22:44 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 22:28 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host an-worker1172.eqiad.wmnet
* 22:25 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1104.eqiad.wmnet with OS trixie
* 22:07 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 22:05 maryum: Deployed security fix for [[phab:T415584|T415584]]
* 21:53 maryum: Deployed security fix for [[phab:T419192|T419192]]
* 21:41 maryum: Deployed security fix for [[phab:T419168|T419168]]
* 21:35 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 21:25 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255847{{!}}testwiki: Add temporary groups for security testing]] (duration: 12m 33s)
* 21:22 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 21:21 catrope@deploy2002: catrope: Continuing with sync
* 21:18 catrope@deploy2002: catrope: Backport for [[gerrit:1255847{{!}}testwiki: Add temporary groups for security testing]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:12 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1255847{{!}}testwiki: Add temporary groups for security testing]]
* 21:05 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1106.eqiad.wmnet [reason: trixie reimaging]
* 21:05 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1106.eqiad.wmnet [reason: trixie reimaging]
* 21:05 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1104.eqiad.wmnet with OS trixie
* 21:04 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1104.eqiad.wmnet [reason: trixie reimaging]
* 21:03 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1103.eqiad.wmnet [reason: trixie reimaging]
* 20:58 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1256394{{!}}Abstract Wikipedia: Fix API call to get page info (T420725)]], [[gerrit:1259085{{!}}[abstractwiki] Enable the Translate extension (T420656)]], [[gerrit:1250113{{!}}Move testwiki-only Attribution REST API definition to IS]] (duration: 11m 12s)
* 20:54 jforrester@deploy2002: jforrester: Continuing with sync
* 20:53 jforrester@deploy2002: jforrester: Backport for [[gerrit:1256394{{!}}Abstract Wikipedia: Fix API call to get page info (T420725)]], [[gerrit:1259085{{!}}[abstractwiki] Enable the Translate extension (T420656)]], [[gerrit:1250113{{!}}Move testwiki-only Attribution REST API definition to IS]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1103.eqiad.wmnet with OS trixie
* 20:50 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy4002.wikimedia.org
* 20:50 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:50 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy4002.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
* 20:50 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy4002.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
* 20:47 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1256394{{!}}Abstract Wikipedia: Fix API call to get page info (T420725)]], [[gerrit:1259085{{!}}[abstractwiki] Enable the Translate extension (T420656)]], [[gerrit:1250113{{!}}Move testwiki-only Attribution REST API definition to IS]]
* 20:46 sukhe@cumin1003: START - Cookbook sre.dns.netbox
* 20:45 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 20:43 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1102.eqiad.wmnet [reason: trixie reimaging]
* {{safesubst:SAL entry|1=20:42 dani@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254448{{!}}Undeploy participant recruitment survey on ptwiki (T419275)]], [[gerrit:1254450{{!}}Undeploy participant recruitment survey on trwiki (T419275)]], [[gerrit:1254452{{!}}Undeploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1255763{{!}}testKitchen: Add custom stream name (T417050)]], [[gerrit:1259120{{!}}Enable wgCampaignEventsEnableEventGoals in}}
* 20:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1102.eqiad.wmnet with OS trixie
* 20:41 sukhe@cumin1003: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy4002.wikimedia.org
* 20:40 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy4001.wikimedia.org
* 20:40 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:40 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy4001.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
* 20:39 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy4001.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
* 20:37 dani@deploy2002: milimetric, daimona, dani: Continuing with sync
* {{safesubst:SAL entry|1=20:36 dani@deploy2002: milimetric, daimona, dani: Backport for [[gerrit:1254448{{!}}Undeploy participant recruitment survey on ptwiki (T419275)]], [[gerrit:1254450{{!}}Undeploy participant recruitment survey on trwiki (T419275)]], [[gerrit:1254452{{!}}Undeploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1255763{{!}}testKitchen: Add custom stream name (T417050)]], [[gerrit:1259120{{!}}Enable wgCampaignEventsEnableEventGoals i}}
* 20:35 sukhe@cumin1003: START - Cookbook sre.dns.netbox
* {{safesubst:SAL entry|1=20:34 dani@deploy2002: Started scap sync-world: Backport for [[gerrit:1254448{{!}}Undeploy participant recruitment survey on ptwiki (T419275)]], [[gerrit:1254450{{!}}Undeploy participant recruitment survey on trwiki (T419275)]], [[gerrit:1254452{{!}}Undeploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1255763{{!}}testKitchen: Add custom stream name (T417050)]], [[gerrit:1259120{{!}}Enable wgCampaignEventsEnableEventGoals in}}
* 20:31 sukhe@cumin1003: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy4001.wikimedia.org
* 20:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1103.eqiad.wmnet with reason: host reimage
* 20:24 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1103.eqiad.wmnet with reason: host reimage
* 20:23 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 20:19 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1102.eqiad.wmnet with reason: host reimage
* 20:17 alexsanford@deploy2002: Finished scap sync-world: Backport for [[gerrit:1256472{{!}}Reduce reauth timeout for editing site JS to 10 minutes (T419605)]] (duration: 07m 32s)
* 20:14 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1102.eqiad.wmnet with reason: host reimage
* 20:13 alexsanford@deploy2002: alexsanford: Continuing with sync
* 20:11 alexsanford@deploy2002: alexsanford: Backport for [[gerrit:1256472{{!}}Reduce reauth timeout for editing site JS to 10 minutes (T419605)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 alexsanford@deploy2002: Started scap sync-world: Backport for [[gerrit:1256472{{!}}Reduce reauth timeout for editing site JS to 10 minutes (T419605)]]
* 20:08 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1103.eqiad.wmnet with OS trixie
* 20:07 alexsanford: Deployed mitigation for [[phab:T419605|T419605]]
* 19:58 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 19:58 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 19:58 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 19:58 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1102.eqiad.wmnet with OS trixie
* 19:57 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 19:54 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 19:54 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 19:54 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 19:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha-proxy4004.wikimedia.org
* 19:51 cdobbins@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1102.eqiad.wmnet with OS trixie
* 19:50 cdobbins@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1103.eqiad.wmnet with OS trixie
* 19:50 ayounsi@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha-proxy4004.wikimedia.org
* 19:47 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 19:47 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha-proxy4003.wikimedia.org
* 19:46 ayounsi@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha-proxy4003.wikimedia.org
* 19:44 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 19:44 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>aqs[1011,1014,1016-1022]*<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 19:42 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1103.eqiad.wmnet with OS trixie
* 19:42 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1103.eqiad.wmnet [reason: trixie reimaging]
* 19:41 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1101.eqiad.wmnet [reason: trixie reimaging]
* 19:41 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1102.eqiad.wmnet with OS trixie
* 19:41 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1102.eqiad.wmnet [reason: trixie reimaging]
* 19:40 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1101.eqiad.wmnet with OS trixie
* 19:39 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1100.eqiad.wmnet [reason: trixie reimaging]
* 19:37 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1100.eqiad.wmnet with OS trixie
* 19:30 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 19:18 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1101.eqiad.wmnet with reason: host reimage
* 19:14 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1100.eqiad.wmnet with reason: host reimage
* 19:13 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1101.eqiad.wmnet with reason: host reimage
* 19:13 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:13 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:10 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1100.eqiad.wmnet with reason: host reimage
* 18:59 inflatador: bking@deploy2002 restarting opensearch-semantic-search eqiad to renew certs
* 18:57 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1101.eqiad.wmnet with OS trixie
* 18:55 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1101.eqiad.wmnet with OS trixie
* 18:54 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1100.eqiad.wmnet with OS trixie
* 18:53 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1100.eqiad.wmnet with OS trixie
* 18:50 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 18:49 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 18:36 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on hcaptcha-proxy4002.wikimedia.org with reason: depooled host (soon to be decomed)
* 18:35 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on hcaptcha-proxy4001.wikimedia.org with reason: depooled host (soon to be decomed)
* 18:10 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 18:10 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 18:05 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>aqs[1011,1014,1016-1022]*<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 17:54 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 17:54 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1115.eqiad.wmnet with OS trixie
* 17:53 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase-eqiad
* 17:49 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259136{{!}}EventStreamConfig: Document not adding performer attributes to SI interaction v2 stream (T418740)]] (duration: 06m 28s)
* 17:45 dreamyjazz@deploy2002: kharlan, dreamyjazz: Continuing with sync
* 17:45 dreamyjazz@deploy2002: kharlan, dreamyjazz: Backport for [[gerrit:1259136{{!}}EventStreamConfig: Document not adding performer attributes to SI interaction v2 stream (T418740)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:43 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1259136{{!}}EventStreamConfig: Document not adding performer attributes to SI interaction v2 stream (T418740)]]
* 17:35 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:34 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1101.eqiad.wmnet with OS trixie
* 17:34 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1101.eqiad.wmnet [reason: trixie reimaging]
* 17:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:31 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1100.eqiad.wmnet with OS trixie
* 17:30 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1100.eqiad.wmnet [reason: trixie reimaging]
* 17:27 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:26 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:24 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:23 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:22 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:21 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 17:21 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:20 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 17:20 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:18 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:17 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:17 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:16 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 17:13 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:13 bd808@deploy2002: Finished deploy [releng/jenkins-deploy@f47af21] (releasing): jobs: Use TZ=UTC in branchMWSingleVersion.groovy trigger ([[phab:T404399|T404399]]) (duration: 01m 36s)
* 17:12 bd808@deploy2002: Started deploy [releng/jenkins-deploy@f47af21] (releasing): jobs: Use TZ=UTC in branchMWSingleVersion.groovy trigger ([[phab:T404399|T404399]])
* 17:12 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:09 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:08 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:08 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:08 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:07 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:06 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:04 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:04 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:03 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:02 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 17:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1001.eqiad.wmnet
* 16:56 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:56 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 14 hosts
* 16:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:55 sukhe@cumin1003: START - Cookbook sre.hosts.remove-downtime for 14 hosts
* 16:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1001.eqiad.wmnet
* 16:53 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:52 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:52 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:46 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:41 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:38 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin2001.codfw.wmnet
* 16:35 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 16:34 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:34 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:34 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcumin2001.codfw.wmnet
* 16:32 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 16:31 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 16:30 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:29 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1023.eqiad.wmnet
* 16:29 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1023.eqiad.wmnet
* 16:28 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:27 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:24 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for aqs1010.eqiad.wmnet
* 16:24 eevans@cumin1003: START - Cookbook sre.hosts.remove-downtime for aqs1010.eqiad.wmnet
* 16:21 jgreen@dns1004: END - running authdns-update
* 16:19 jgreen@dns1004: START - running authdns-update
* 16:18 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:17 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:11 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Rebooting clouddb1023 [[phab:T419960|T419960]]
* 16:09 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1025.eqiad.wmnet
* 16:09 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1025.eqiad.wmnet
* 16:09 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1025.eqiad.wmnet
* 16:04 urandom: stopping aqs1010 for SSD replacement — [[phab:T420867|T420867]]
* 16:03 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:03 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on aqs1010.eqiad.wmnet with reason: Shutting down for SSD replacement — [[phab:T420867|T420867]]
* 15:58 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1025.eqiad.wmnet
* 15:57 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1025.eqiad.wmnet with reason: Rebooting clouddb1025 [[phab:T419960|T419960]]
* 15:57 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:53 topranks: disabling puppet for nftables-enabled machines to validate new ruleset on selected hosts before wider rollout [[phab:T420715|T420715]]
* 15:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:50 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:49 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 15:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:31 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1172.eqiad.wmnet
* 15:21 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 15:20 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 15:15 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1020.eqiad.wmnet
* 15:14 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1020.eqiad.wmnet
* 15:14 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1020.eqiad.wmnet
* 15:05 btullis@cumin1003: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker1172.eqiad.wmnet
* 15:03 btullis@cumin1003: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1172.eqiad.wmnet
* 15:03 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) opensearch-ipoid.discovery.wmnet on all recursors
* 15:03 bking@cumin2002: START - Cookbook sre.dns.wipe-cache opensearch-ipoid.discovery.wmnet on all recursors
* 15:03 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 15:01 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 15:01 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) opensearch-ipoid.discovery.wmnet on all recursors
* 14:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache opensearch-ipoid.discovery.wmnet on all recursors
* 14:58 sukhe@dns1004: END - running authdns-update
* 14:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) opensearch-test.discovery.wmnet on all recursors
* 14:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache opensearch-test.discovery.wmnet on all recursors
* 14:57 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet
* 14:57 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:56 sukhe@dns1004: START - running authdns-update
* 14:56 sukhe@dns1004: END - running authdns-update
* 14:56 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1020.eqiad.wmnet with reason: Rebooting clouddb1020 [[phab:T419960|T419960]]
* 14:56 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1019.eqiad.wmnet
* 14:56 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1019.eqiad.wmnet
* 14:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:55 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1019.eqiad.wmnet
* 14:55 sukhe@dns1004: START - running authdns-update
* 14:55 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase-eqiad
* 14:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:49 sukhe@dns1004: END - running authdns-update
* 14:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:48 sukhe@dns1004: START - running authdns-update
* 14:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:46 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1003.eqiad.wmnet
* 14:45 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 14:44 sukhe@dns1004: END - running authdns-update
* 14:43 sukhe@dns1004: START - running authdns-update
* 14:40 sukhe@dns1004: FAIL - running authdns-update
* 14:39 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1003.eqiad.wmnet
* 14:38 sukhe@dns1004: START - running authdns-update
* 14:37 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 14:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) k8s-ingress-dse-aa.discovery.wmnet on all recursors
* 14:36 bking@cumin2002: START - Cookbook sre.dns.wipe-cache k8s-ingress-dse-aa.discovery.wmnet on all recursors
* 14:34 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1019.eqiad.wmnet
* 14:34 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1019.eqiad.wmnet with reason: Rebooting clouddb1019 [[phab:T419960|T419960]]
* 14:33 sukhe@dns1004: FAIL - running authdns-update
* 14:33 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1002.eqiad.wmnet
* 14:33 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1018.eqiad.wmnet
* 14:33 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1018.eqiad.wmnet
* 14:32 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1018.eqiad.wmnet
* 14:32 sukhe@dns1004: START - running authdns-update
* 14:31 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=codfw
* 14:30 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-worker1172.eqiad.wmnet with OS bullseye
* 14:30 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:27 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1002.eqiad.wmnet
* 14:22 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1018.eqiad.wmnet with reason: Rebooting clouddb1018 [[phab:T419960|T419960]]
* 14:22 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1018.eqiad.wmnet
* 14:22 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1018.eqiad.wmnet
* 14:21 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1018.eqiad.wmnet
* 14:20 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:17 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1001.eqiad.wmnet
* 14:14 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 14:14 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2352-2356].codfw.wmnet
* 14:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2352-2356].codfw.wmnet
* 14:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on db1253.eqiad.wmnet with reason: Under repair
* 14:11 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1001.eqiad.wmnet
* 14:07 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2352-2356].codfw.wmnet
* 14:04 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha2002.wikimedia.org
* 14:04 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2352-2356].codfw.wmnet
* 14:03 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2347-2351].codfw.wmnet
* 14:03 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2347-2351].codfw.wmnet
* 14:00 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha2002.wikimedia.org
* 14:00 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha2001.wikimedia.org
* 13:59 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1172.eqiad.wmnet with reason: host reimage
* 13:57 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2347-2351].codfw.wmnet
* 13:56 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha2001.wikimedia.org
* 13:56 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:55 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha1002.wikimedia.org
* 13:55 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1172.eqiad.wmnet with reason: host reimage
* 13:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2347-2351].codfw.wmnet
* 13:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2342-2346].codfw.wmnet
* 13:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2342-2346].codfw.wmnet
* 13:51 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha1002.wikimedia.org
* 13:51 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha1001.wikimedia.org
* 13:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:47 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha1001.wikimedia.org
* 13:47 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2342-2346].codfw.wmnet
* 13:43 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2342-2346].codfw.wmnet
* 13:43 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2337-2341].codfw.wmnet
* 13:43 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2337-2341].codfw.wmnet
* 13:42 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host an-worker1172.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:42 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 13:41 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2012.codfw.wmnet
* 13:41 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2011.codfw.wmnet
* 13:38 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudlb1002.eqiad.wmnet
* 13:36 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2012.codfw.wmnet
* 13:36 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2011.codfw.wmnet
* 13:36 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2337-2341].codfw.wmnet
* 13:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2337-2341].codfw.wmnet
* 13:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2336].codfw.wmnet
* 13:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2336].codfw.wmnet
* 13:30 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2003.codfw.wmnet
* 13:30 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1002.eqiad.wmnet
* 13:29 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudlb1002.eqiad.wmnet
* 13:29 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1002.eqiad.wmnet
* 13:28 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudlb1001.eqiad.wmnet
* 13:25 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2332-2336].codfw.wmnet
* 13:24 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2003.codfw.wmnet
* 13:21 jforrester@deploy2002: mwscript-k8s job started: extensions/WikimediaMaintenance/maintenance/createExtensionTables.php --wiki=abstractwiki translate # [[phab:T420656|T420656]]
* 13:21 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2332-2336].codfw.wmnet
* 13:21 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 13:20 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1001.eqiad.wmnet
* 13:20 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudlb1001.eqiad.wmnet
* 13:20 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1001.eqiad.wmnet
* 13:19 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:19 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudlb1001.eqiad.wmnet
* 13:18 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1001.eqiad.wmnet
* 13:17 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259035{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added (T420722)]], [[gerrit:1259036{{!}}tests: add coverage for WelcomeSurveyHooks::onCentralAuthPostLoginRedirect (T420722)]], [[gerrit:1259046{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added 2 (T420722)]] (duration: 11m 43s)
* 13:16 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:14 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:13 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:11 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:11 sgimeno@deploy2002: sgimeno: Continuing with sync
* 13:08 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2005-2006,2011-2018,2033-2039,2041-2042,2044,2046,2049-2051,2055-2062,2064-2065,2067-2078,2087-2095,2102-2115,2124-2179,2184-2199].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 13:08 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2186-2199].codfw.wmnet
* 13:08 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2186-2199].codfw.wmnet
* 13:07 sgimeno@deploy2002: sgimeno: Backport for [[gerrit:1259035{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added (T420722)]], [[gerrit:1259036{{!}}tests: add coverage for WelcomeSurveyHooks::onCentralAuthPostLoginRedirect (T420722)]], [[gerrit:1259046{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added 2 (T420722)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1259035{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added (T420722)]], [[gerrit:1259036{{!}}tests: add coverage for WelcomeSurveyHooks::onCentralAuthPostLoginRedirect (T420722)]], [[gerrit:1259046{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added 2 (T420722)]]
* 12:43 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "bast4006 - ayounsi@cumin1003"
* 12:42 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "bast4006 - ayounsi@cumin1003"
* 12:42 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host testvm2006.codfw.wmnet with OS bookworm
* 12:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host bast4006.wikimedia.org
* 12:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast4006.wikimedia.org with OS bookworm
* 12:34 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:28 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
* 12:22 cmooney@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
* 12:18 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast4006.wikimedia.org with reason: host reimage
* 12:16 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2186-2199].codfw.wmnet
* 12:14 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on bast4006.wikimedia.org with reason: host reimage
* 12:08 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host testvm2006.codfw.wmnet with OS bookworm
* 12:07 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2186-2199].codfw.wmnet
* 12:04 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2168-2179,2184-2185].codfw.wmnet
* 12:04 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2168-2179,2184-2185].codfw.wmnet
* 11:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2168-2179,2184-2185].codfw.wmnet
* 11:44 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2168-2179,2184-2185].codfw.wmnet
* 11:40 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2154-2167].codfw.wmnet
* 11:40 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2154-2167].codfw.wmnet
* 11:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2154-2167].codfw.wmnet
* 11:23 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2154-2167].codfw.wmnet
* 11:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2140-2153].codfw.wmnet
* 11:20 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS bookworm
* 11:20 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2140-2153].codfw.wmnet
* 11:19 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM bast4006.wikimedia.org - ayounsi@cumin1003"
* 11:19 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM bast4006.wikimedia.org - ayounsi@cumin1003"
* 11:19 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) bast4006.wikimedia.org on all recursors
* 11:19 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache bast4006.wikimedia.org on all recursors
* 11:19 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:19 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM bast4006.wikimedia.org - ayounsi@cumin1003"
* 11:19 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM bast4006.wikimedia.org - ayounsi@cumin1003"
* 11:15 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 11:15 ayounsi@cumin1003: START - Cookbook sre.ganeti.makevm for new host bast4006.wikimedia.org
* 11:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts install4003.wikimedia.org
* 11:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install4003.wikimedia.org decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003"
* 11:08 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install4003.wikimedia.org decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003"
* 11:08 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2140-2153].codfw.wmnet
* 11:05 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 11:00 ayounsi@cumin1003: START - Cookbook sre.hosts.decommission for hosts install4003.wikimedia.org
* 10:57 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:55 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2140-2153].codfw.wmnet
* 10:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2126-2139].codfw.wmnet
* 10:55 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2126-2139].codfw.wmnet
* 10:44 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2126-2139].codfw.wmnet
* 10:43 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:38 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 10:38 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:38 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:38 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 10:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2126-2139].codfw.wmnet
* 10:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2104-2115,2124-2125].codfw.wmnet
* 10:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2104-2115,2124-2125].codfw.wmnet
* 10:28 topranks: disable puppet on routed-ganeti hosts to test nftables update on specific nodes [[phab:T420715|T420715]]
* 10:27 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s1
* 10:25 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s1
* 10:25 ayounsi@dns1004: END - running authdns-update
* 10:24 ayounsi@dns1004: START - running authdns-update
* 10:23 btullis@cumin1003: START - Cookbook sre.hosts.provision for host an-worker1172.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 10:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2104-2115,2124-2125].codfw.wmnet
* 10:20 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s4
* 10:18 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s4
* 10:13 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s8
* 10:11 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s8
* 10:09 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1172.eqiad.wmnet with OS bullseye
* 10:08 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2104-2115,2124-2125].codfw.wmnet
* 10:05 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s7
* 10:05 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2076-2078,2087-2095,2102-2103].codfw.wmnet
* 10:04 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2076-2078,2087-2095,2102-2103].codfw.wmnet
* 10:04 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s7
* 09:58 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s3
* 09:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:57 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s3
* 09:53 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:53 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2076-2078,2087-2095,2102-2103].codfw.wmnet
* 09:52 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s2
* 09:49 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s2
* 09:49 blake@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 09:49 blake@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 09:49 blake@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 09:49 blake@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 09:44 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2076-2078,2087-2095,2102-2103].codfw.wmnet
* 09:44 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s5
* 09:44 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2060-2062,2064-2065,2067-2075].codfw.wmnet
* 09:44 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2060-2062,2064-2065,2067-2075].codfw.wmnet
* 09:42 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s5
* 09:40 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:39 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:33 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s6
* 09:32 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s6
* 09:29 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:29 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:25 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 09:25 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 09:24 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section es7
* 09:23 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section es7
* 09:22 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 09:22 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section es6
* 09:16 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section es6
* 09:11 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:11 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:10 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section x3
* 09:09 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section x3
* 09:08 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2060-2062,2064-2065,2067-2075].codfw.wmnet
* 09:05 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:05 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:02 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section x1
* 09:01 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section x1
* 09:00 federico3: starting [[phab:T416706|T416706]]
* 09:00 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2060-2062,2064-2065,2067-2075].codfw.wmnet
* 08:59 fceratto@cumin1003: END (FAIL) - Cookbook sre.switchdc.databases.prepare (exit_code=99) for the switch from eqiad to codfw for section test-s4
* 08:59 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from eqiad to codfw for section test-s4
* 08:59 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2038-2039,2041-2042,2044,2046,2049-2051,2055-2059].codfw.wmnet
* 08:59 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2038-2039,2041-2042,2044,2046,2049-2051,2055-2059].codfw.wmnet
* 08:50 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2038-2039,2041-2042,2044,2046,2049-2051,2055-2059].codfw.wmnet
* 08:46 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255736{{!}}hcaptcha: Use the global edit key for MobileFrontend edits if present (T420574)]] (duration: 14m 42s)
* 08:44 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 08:43 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 08:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2038-2039,2041-2042,2044,2046,2049-2051,2055-2059].codfw.wmnet
* 08:40 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2005-2006,2011-2018,2033-2037].codfw.wmnet
* 08:40 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2005-2006,2011-2018,2033-2037].codfw.wmnet
* 08:40 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:39 kharlan@deploy2002: kharlan: Continuing with sync
* 08:38 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:37 kharlan@deploy2002: kharlan: Backport for [[gerrit:1255736{{!}}hcaptcha: Use the global edit key for MobileFrontend edits if present (T420574)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 08:31 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1255736{{!}}hcaptcha: Use the global edit key for MobileFrontend edits if present (T420574)]]
* 08:29 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2005-2006,2011-2018,2033-2037].codfw.wmnet
* 08:19 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2005-2006,2011-2018,2033-2037].codfw.wmnet
* 08:18 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2005-2006,2011-2018,2033-2039,2041-2042,2044,2046,2049-2051,2055-2062,2064-2065,2067-2078,2087-2095,2102-2115,2124-2179,2184-2199].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 07:45 kartik@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254149{{!}}Enable ULS rewrite beta feature (T418187 T253303)]] (duration: 41m 30s)
* 07:42 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 07:33 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 07:30 kartik@deploy2002: kartik, abi: Continuing with sync
* 07:30 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 07:29 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 07:22 kartik@deploy2002: kartik, abi: Backport for [[gerrit:1254149{{!}}Enable ULS rewrite beta feature (T418187 T253303)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 07:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 07:17 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 07:16 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 07:03 kartik@deploy2002: Started scap sync-world: Backport for [[gerrit:1254149{{!}}Enable ULS rewrite beta feature (T418187 T253303)]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 55s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-22 ==
* 02:50 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doh7004.wikimedia.org with reason: depooled host
* 02:50 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doh7003.wikimedia.org with reason: depooled host
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 21s)
* 02:01 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-20 ==
* 23:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2013.codfw.wmnet
* 23:30 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs2013.codfw.wmnet
* 22:34 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host lvs2013.codfw.wmnet
* 22:34 brett: Started pybal on lvs2013
* 22:27 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 21:57 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5023.eqsin.wmnet [reason: trixie reimaging]
* 21:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5023.eqsin.wmnet with OS trixie
* 21:55 hashar: Upgrading CI Jenkins [[phab:T420477|T420477]]
* 21:25 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5023.eqsin.wmnet with reason: host reimage
* 21:21 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5023.eqsin.wmnet with reason: host reimage
* 21:04 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2013.codfw.wmnet with reason: debugging ipip
* 20:46 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5023.eqsin.wmnet with OS trixie
* 20:45 mutante: contint1003/2003 apt remove --purge apache2* ; apt remove --purge php* {{!}} [[phab:T418521|T418521]]
* 20:43 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
* 20:40 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 20:38 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5023.eqsin.wmnet with OS trixie
* 20:24 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doh3006.wikimedia.org with reason: depooled host
* 20:24 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doh3005.wikimedia.org with reason: depooled host
* 20:23 sukhe@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on doh3005.wikimedia.org with reason: depooled host
* 19:50 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2013.codfw.wmnet with reason: debugging ipip
* 19:33 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
* 19:30 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 19:21 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy (exit_code=0) rolling reboot on A:tcpproxy and A:tcpproxy
* 19:16 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5023.eqsin.wmnet with OS trixie
* 19:16 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5023.eqsin.wmnet [reason: trixie reimaging]
* 19:16 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5021.eqsin.wmnet [reason: trixie reimaging]
* 19:14 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5021.eqsin.wmnet with OS trixie
* 18:52 cdobbins@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2013.codfw.wmnet with reason: reboot
* 18:43 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5021.eqsin.wmnet with reason: host reimage
* 18:39 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5021.eqsin.wmnet with reason: host reimage
* 18:28 cdobbins@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2013.codfw.wmnet with reason: reboot
* 18:16 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy rolling reboot on A:tcpproxy and A:tcpproxy
* 18:14 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on db1253.eqiad.wmnet with reason: [[phab:T420041|T420041]]
* 17:59 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5021.eqsin.wmnet with OS trixie
* 17:54 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5021.eqsin.wmnet with OS trixie
* 17:51 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host lvs2014.codfw.wmnet
* 17:40 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on contint1003.wikimedia.org with reason: jenkins on java21
* 17:39 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2014.codfw.wmnet
* 16:54 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 16:54 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1172.eqiad.wmnet with OS bullseye
* 16:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 16:33 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5021.eqsin.wmnet with OS trixie
* 16:32 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5021.eqsin.wmnet [reason: trixie reimaging]
* 16:09 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1172.eqiad.wmnet with OS bullseye
* 16:08 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2002.codfw.wmnet
* 16:02 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf2002.codfw.wmnet
* 15:51 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2001.codfw.wmnet
* 15:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 15:45 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf2001.codfw.wmnet
* 15:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2041.codfw.wmnet
* 15:37 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc2041.codfw.wmnet
* 15:32 cparle@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:32 cparle@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-eqiad
* 15:16 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2040.codfw.wmnet
* 15:10 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc2040.codfw.wmnet
* 15:02 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 15:01 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 15:00 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:59 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:58 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:58 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:57 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:56 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:55 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:50 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2002].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 14:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2002.codfw.wmnet
* 14:45 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2002.codfw.wmnet
* 14:44 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:44 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:37 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2002.codfw.wmnet
* 14:37 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2002.codfw.wmnet
* 14:37 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2001.codfw.wmnet
* 14:37 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2001.codfw.wmnet
* 14:36 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:35 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:35 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:34 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:30 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2001.codfw.wmnet
* 14:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2001.codfw.wmnet
* 14:29 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2002].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 14:27 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1335-1349].eqiad.wmnet
* 14:27 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1335-1349].eqiad.wmnet
* 14:26 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2039.codfw.wmnet
* 14:21 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc2039.codfw.wmnet
* 14:16 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:16 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2038.codfw.wmnet
* 14:08 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc2038.codfw.wmnet
* 13:54 jgreen@dns1004: END - running authdns-update
* 13:52 jgreen@dns1004: START - running authdns-update
* 13:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:39 inflatador: bking@deploy2002 restarting opensearch-ipoid cluster to apply new certificates
* 13:33 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-eqiad
* 13:20 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:14 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-canary
* 13:14 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for doh[3005-3006].wikimedia.org
* 13:14 sukhe@cumin1003: START - Cookbook sre.hosts.remove-downtime for doh[3005-3006].wikimedia.org
* 13:08 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-canary
* 13:05 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:03 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:58 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2006.codfw.wmnet
* 12:56 cparle@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 12:55 cparle@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 12:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2006.codfw.wmnet
* 12:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2001.codfw.wmnet
* 12:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2001.codfw.wmnet
* 12:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1005.eqiad.wmnet
* 12:35 jiji@cumin1003: END (ERROR) - Cookbook sre.memcached.roll-reboot-restart (exit_code=97) rolling reboot on A:memcached-codfw
* 12:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1005.eqiad.wmnet
* 12:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw2001.wikimedia.org
* 12:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw2001.wikimedia.org
* 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw1001.wikimedia.org
* 12:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw1001.wikimedia.org
* 11:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 11:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1172.eqiad.wmnet with OS bullseye
* 11:27 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 11:24 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 10:26 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 10:13 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:12 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:10 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:05 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:05 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:04 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:02 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:58 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:57 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:57 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:56 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:56 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:55 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:53 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:50 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:46 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:46 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:45 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:37 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:36 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:36 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:35 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:35 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:34 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:34 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:33 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:26 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:25 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:24 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:23 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:19 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:19 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:18 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:18 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:18 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:15 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:57 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 05:30 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5024.eqsin.wmnet [reason: trixie reimaging]
* 05:30 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5019.eqsin.wmnet [reason: trixie reimaging]
* 02:43 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on doh3005.wikimedia.org with reason: alerting is flapping
* 02:42 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on doh3006.wikimedia.org with reason: alerting is flapping
* 01:21 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5019.eqsin.wmnet with OS trixie
* 01:17 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5024.eqsin.wmnet with OS trixie
* 00:48 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5019.eqsin.wmnet with reason: host reimage
* 00:44 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
* 00:38 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5019.eqsin.wmnet with reason: host reimage
* 00:37 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
* 00:05 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5024.eqsin.wmnet with OS trixie
* 00:01 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5024.eqsin.wmnet with OS trixie
* 00:01 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5024.eqsin.wmnet with OS trixie
== 2026-03-19 ==
* 23:59 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5019.eqsin.wmnet with OS trixie
* 23:40 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255801{{!}}Make the handler follow the thumb steps (T414805)]] (duration: 06m 14s)
* 23:36 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 23:35 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1255801{{!}}Make the handler follow the thumb steps (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:33 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1255801{{!}}Make the handler follow the thumb steps (T414805)]]
* 22:48 zabe@deploy2002: mwscript-k8s job started: foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https # [[phab:T420643|T420643]]
* 22:19 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 22:18 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 22:17 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 22:16 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 22:08 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255886{{!}}Set WikiLambdaAbstractNamespaces's merge_strategy to provide_default (T420649)]] (duration: 06m 46s)
* 22:04 jforrester@deploy2002: jforrester: Continuing with sync
* 22:03 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255886{{!}}Set WikiLambdaAbstractNamespaces's merge_strategy to provide_default (T420649)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:01 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255886{{!}}Set WikiLambdaAbstractNamespaces's merge_strategy to provide_default (T420649)]]
* 21:58 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5019.eqsin.wmnet with OS trixie
* 21:57 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase-codfw
* 21:57 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5019.eqsin.wmnet [reason: trixie reimaging]
* 21:56 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5024.eqsin.wmnet with OS trixie
* 21:56 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5024.eqsin.wmnet [reason: trixie reimaging]
* 21:55 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255765{{!}}Implement addListener fallback for older browsers in matchMedia (T419717)]] (duration: 07m 17s)
* 21:51 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 21:49 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1255765{{!}}Implement addListener fallback for older browsers in matchMedia (T419717)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:48 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1255765{{!}}Implement addListener fallback for older browsers in matchMedia (T419717)]]
* 21:29 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255881{{!}}Skins: Address issue with blurry images for large thumbnails (T375981)]] (duration: 07m 03s)
* 21:25 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 21:24 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1255881{{!}}Skins: Address issue with blurry images for large thumbnails (T375981)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:22 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1255881{{!}}Skins: Address issue with blurry images for large thumbnails (T375981)]]
* 21:11 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup2020.codfw.wmnet with reason: kernel module reload
* 21:10 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 11 hosts with reason: kernel module reload
* 20:36 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254865{{!}}Deploy Extension:PersonalDashboard to English Wikipedia (T418367)]], [[gerrit:1253654{{!}}Deploy PRV to 13 wikis (T420273)]] (duration: 11m 00s)
* 20:32 kgraessle@deploy2002: kgraessle, arlolra: Continuing with sync
* 20:27 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1016.eqiad.wmnet
* 20:27 kgraessle@deploy2002: kgraessle, arlolra: Backport for [[gerrit:1254865{{!}}Deploy Extension:PersonalDashboard to English Wikipedia (T418367)]], [[gerrit:1253654{{!}}Deploy PRV to 13 wikis (T420273)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1254865{{!}}Deploy Extension:PersonalDashboard to English Wikipedia (T418367)]], [[gerrit:1253654{{!}}Deploy PRV to 13 wikis (T420273)]]
* 20:25 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1016.eqiad.wmnet
* 20:11 cdobbins@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs1016.eqiad.wmnet with reason: reboot
* 20:01 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:01 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add analytic vlan hostnames - cmooney@cumin1003"
* 20:01 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add analytic vlan hostnames - cmooney@cumin1003"
* 19:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1018.eqiad.wmnet
* 19:56 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 19:56 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1018.eqiad.wmnet
* 19:55 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 19:53 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) 4.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.8.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa. on all recursors
* 19:53 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache 4.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.8.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa. on all recursors
* 19:52 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 19:51 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 7 hosts with reason: kernel module reload
* 19:44 topranks: disable IPv6 VRRP for et-1/0/5.1023 sub-interfaces on eqiad core routers [[phab:T405562|T405562]]
* 19:36 brett: stopping pybal/puppet on lvs1018 for reboots
* 19:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: reboots
* 19:00 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 6 hosts with reason: kernel module reload
* 19:00 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1019.eqiad.wmnet
* 19:00 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase-codfw
* 19:00 topranks: add vlan sub-interface for analytics1-d-eqiad vlan to leaf switches in eqiad row d [[phab:T405562|T405562]]
* 18:44 cdobbins@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs1019.eqiad.wmnet with reason: planned reboot
* 18:42 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:aqs-codfw
* 18:31 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255824{{!}}RepoBooks::onMediaWikiServices: Skip all low NSes, not just NS0 (T420617)]], [[gerrit:1255820{{!}}SpecialAbstractContent: Fix hard-coded policy list page namespace]], [[gerrit:1255794{{!}}[abstractwiki] Allow "Abstract:" as well as "Abstract Wikipedia:" as NS_PROJECT]] (duration: 06m 20s)
* 18:27 jforrester@deploy2002: jforrester: Continuing with sync
* 18:26 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255824{{!}}RepoBooks::onMediaWikiServices: Skip all low NSes, not just NS0 (T420617)]], [[gerrit:1255820{{!}}SpecialAbstractContent: Fix hard-coded policy list page namespace]], [[gerrit:1255794{{!}}[abstractwiki] Allow "Abstract:" as well as "Abstract Wikipedia:" as NS_PROJECT]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:24 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255824{{!}}RepoBooks::onMediaWikiServices: Skip all low NSes, not just NS0 (T420617)]], [[gerrit:1255820{{!}}SpecialAbstractContent: Fix hard-coded policy list page namespace]], [[gerrit:1255794{{!}}[abstractwiki] Allow "Abstract:" as well as "Abstract Wikipedia:" as NS_PROJECT]]
* 18:02 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1020.eqiad.wmnet
* 17:55 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
* 17:46 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 17:46 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 17:45 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host lvs1020.eqiad.wmnet
* 17:44 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
* 17:30 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 17:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host hcaptcha-proxy4004.wikimedia.org
* 17:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy4004.wikimedia.org with OS bookworm
* 17:26 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on contint1003.wikimedia.org with reason: jenkins on java21
* 17:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp5026.eqsin.wmnet
* 17:22 bd808@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:22 bd808@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:22 bd808@deploy2002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:22 bd808@deploy2002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:21 bd808@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 17:15 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp5026.eqsin.wmnet
* 17:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy4004.wikimedia.org with reason: host reimage
* 17:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts doh4002.wikimedia.org
* 17:08 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:08 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: doh4002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 17:08 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy4004.wikimedia.org with reason: host reimage
* 17:08 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: doh4002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 17:07 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 17:07 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 17:05 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp5026.eqsin.wmnet with reason: firmware updates
* 17:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 17:03 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5025.*
* 17:01 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp5025.eqsin.wmnet
* 16:59 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts doh4002.wikimedia.org
* 16:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts doh4001.wikimedia.org
* 16:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: doh4001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 16:59 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: doh4001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 16:58 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 16:57 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 16:57 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 16:56 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 16:54 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:52 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp5025.eqsin.wmnet
* 16:50 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts doh4001.wikimedia.org
* 16:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-master1004.eqiad.wmnet
* 16:46 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1151.eqiad.wmnet
* 16:45 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5026.eqsin.wmnet with OS trixie
* 16:44 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy4004.wikimedia.org with OS bookworm
* 16:44 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet,service=s3
* 16:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4004.wikimedia.org - jmm@cumin2002"
* 16:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4004.wikimedia.org - jmm@cumin2002"
* 16:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy4004.wikimedia.org on all recursors
* 16:43 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy4004.wikimedia.org on all recursors
* 16:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4004.wikimedia.org - jmm@cumin2002"
* 16:42 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4004.wikimedia.org - jmm@cumin2002"
* 16:42 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255039{{!}}Activate Abstract Wikipedia (T411723)]] (duration: 06m 09s)
* 16:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp5025.eqsin.wmnet with reason: firmware updates
* 16:41 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-master1004.eqiad.wmnet
* 16:40 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5025.eqsin.wmnet with OS trixie
* 16:39 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1151.eqiad.wmnet
* 16:39 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:39 jmm@cumin2002: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
* 16:38 jforrester@deploy2002: jforrester: Continuing with sync
* 16:38 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255039{{!}}Activate Abstract Wikipedia (T411723)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:36 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255039{{!}}Activate Abstract Wikipedia (T411723)]]
* 16:35 jmm@cumin2002: END (PASS) - Cookbook sre.netbox.restart-reboot (exit_code=0) rolling reboot on A:netbox
* 16:33 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255779{{!}}Revert "[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki"]] (duration: 07m 19s)
* 16:29 jforrester@deploy2002: jforrester: Continuing with sync
* 16:28 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255779{{!}}Revert "[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:26 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255779{{!}}Revert "[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki"]]
* 16:25 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:24 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255773{{!}}[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki (T420531)]] (duration: 06m 06s)
* 16:23 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:aqs-codfw
* 16:21 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 16:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 16:20 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:20 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy4004.wikimedia.org
* 16:20 fabfur@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=1) rolling upgrade of HAProxy on P<nowiki>{</nowiki>cp2041*<nowiki>}</nowiki> and A:cp - 3.2 test upgrade ()
* 16:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P<nowiki>{</nowiki>cp2041*<nowiki>}</nowiki> and A:cp - 3.2 test upgrade ()
* 16:20 jforrester@deploy2002: jforrester: Continuing with sync
* 16:19 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255773{{!}}[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki (T420531)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2003.codfw.wmnet
* 16:17 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255773{{!}}[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki (T420531)]]
* 16:17 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 16:17 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 16:17 jmm@cumin2002: START - Cookbook sre.netbox.restart-reboot rolling reboot on A:netbox
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host hcaptcha-proxy4003.wikimedia.org
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy4003.wikimedia.org with OS bookworm
* 16:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1142.eqiad.wmnet
* 16:14 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk2003.codfw.wmnet
* 16:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2002.codfw.wmnet
* 16:13 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
* 16:10 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk2002.codfw.wmnet
* 16:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5025.eqsin.wmnet with reason: host reimage
* 16:08 brouberol@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 16:07 brouberol@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 16:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1142.eqiad.wmnet
* 16:06 brouberol@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 16:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2001.codfw.wmnet
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5025.eqsin.wmnet with reason: host reimage
* 16:05 brouberol@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 16:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk2001.codfw.wmnet
* 15:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy4003.wikimedia.org with reason: host reimage
* 15:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy4003.wikimedia.org with reason: host reimage
* 15:36 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief2002.codfw.wmnet
* 15:35 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5025.eqsin.wmnet with OS trixie
* 15:34 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5026.eqsin.wmnet with OS trixie
* 15:34 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5025.eqsin.wmnet with OS trixie
* 15:34 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5026.eqsin.wmnet with OS trixie
* 15:32 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief2002.codfw.wmnet
* 15:32 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief1002.eqiad.wmnet
* 15:32 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy4003.wikimedia.org with OS bookworm
* 15:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4003.wikimedia.org - jmm@cumin2002"
* 15:31 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4003.wikimedia.org - jmm@cumin2002"
* 15:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy4003.wikimedia.org on all recursors
* 15:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy4003.wikimedia.org on all recursors
* 15:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4003.wikimedia.org - jmm@cumin2002"
* 15:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test2001.codfw.wmnet
* 15:29 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test1001.eqiad.wmnet
* 15:28 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief1002.eqiad.wmnet
* 15:26 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test2001.codfw.wmnet
* 15:25 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test1001.eqiad.wmnet
* 15:22 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255747{{!}}Hooks: Re-apply I52fc151ab88d79754baeff35d2c0f200ebe9fc9a]] (duration: 09m 55s)
* 15:22 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4003.wikimedia.org - jmm@cumin2002"
* 15:22 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:22 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:21 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:21 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:21 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:18 phuedx@deploy2002: phuedx: Continuing with sync
* 15:18 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:17 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:17 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 15:16 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 15:16 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 15:15 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 15:15 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 15:15 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 15:15 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 15:14 phuedx@deploy2002: phuedx: Backport for [[gerrit:1255747{{!}}Hooks: Re-apply I52fc151ab88d79754baeff35d2c0f200ebe9fc9a]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 15:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy4003.wikimedia.org
* 15:12 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1255747{{!}}Hooks: Re-apply I52fc151ab88d79754baeff35d2c0f200ebe9fc9a]]
* 15:11 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 15:10 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 15:10 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 15:09 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 15:09 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 15:09 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 15:07 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5026.eqsin.wmnet with OS trixie
* 15:06 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5025.eqsin.wmnet with OS trixie
* 15:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh4004.wikimedia.org
* 15:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host doh4004.wikimedia.org with OS bookworm
* 15:05 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk1003.eqiad.wmnet
* 15:01 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk1003.eqiad.wmnet
* 14:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk1002.eqiad.wmnet
* 14:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host matomo1003.eqiad.wmnet
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zookeeper-test1002.eqiad.wmnet
* 14:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk1002.eqiad.wmnet
* 14:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk1001.eqiad.wmnet
* 14:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host matomo1003.eqiad.wmnet
* 14:52 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host zookeeper-test1002.eqiad.wmnet
* 14:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk1001.eqiad.wmnet
* 14:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-conf1006.eqiad.wmnet
* 14:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh4004.wikimedia.org with reason: host reimage
* 14:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-conf1006.eqiad.wmnet
* 14:43 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh4004.wikimedia.org with reason: host reimage
* 14:40 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 14:38 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 14:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-conf1005.eqiad.wmnet
* 14:32 bking@cumin2002: conftool action : set/pooled=yes:weight=1; selector: name=dse-k8s-worker1010.eqiad.wmnet{{!}}dse-k8s-worker1011.eqiad.wmnet{{!}}dse-k8s-worker1012.eqiad.wmnet{{!}}dse-k8s-worker1013.eqiad.wmnet{{!}}dse-k8s-worker1015.eqiad.wmnet{{!}}dse-k8s-worker1016.eqiad.wmnet{{!}}dse-k8s-worker1017.eqiad.wmnet{{!}}dse-k8s-worker1018.eqiad.wmnet{{!}}dse-k8s-worker1019.eqiad.wmnet
* 14:29 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-conf1005.eqiad.wmnet
* 14:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-conf1004.eqiad.wmnet
* 14:25 bking@cumin2002: conftool action : set/pooled=yes:weight=10; selector: name=dse-k8s-worker1012.eqiad.wmnet{{!}}dse-k8s-worker1015.eqiad.wmnet{{!}}dse-k8s-worker1016.eqiad.wmnet{{!}}dse-k8s-worker1017.eqiad.wmnet
* 14:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-conf1004.eqiad.wmnet
* 14:21 jmm@deploy2002: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 14:20 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host doh4004.wikimedia.org with OS bookworm
* 14:20 jmm@deploy2002: helmfile [eqiad] START helmfile.d/services/proton: apply
* 14:19 jmm@deploy2002: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 14:18 jmm@deploy2002: helmfile [codfw] START helmfile.d/services/proton: apply
* 14:18 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh4004.wikimedia.org - jmm@cumin2002"
* 14:17 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh4004.wikimedia.org - jmm@cumin2002"
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4004.wikimedia.org on all recursors
* 14:17 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4004.wikimedia.org on all recursors
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4004.wikimedia.org - jmm@cumin2002"
* 14:13 jmm@deploy2002: helmfile [staging] DONE helmfile.d/services/proton: apply
* 14:12 jmm@deploy2002: helmfile [staging] START helmfile.d/services/proton: apply
* 14:11 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4004.wikimedia.org - jmm@cumin2002"
* 14:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:04 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh4004.wikimedia.org
* 14:03 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh4003.wikimedia.org
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host doh4003.wikimedia.org with OS bookworm
* 13:53 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:46 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250107{{!}}Expose new wikifunctions.v0 REST API module on Wikifunctions.org only (T419053)]] (duration: 06m 03s)
* 13:42 jforrester@deploy2002: jforrester: Continuing with sync
* 13:42 jforrester@deploy2002: jforrester: Backport for [[gerrit:1250107{{!}}Expose new wikifunctions.v0 REST API module on Wikifunctions.org only (T419053)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:40 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1250107{{!}}Expose new wikifunctions.v0 REST API module on Wikifunctions.org only (T419053)]]
* 13:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh4003.wikimedia.org with reason: host reimage
* 13:33 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh4003.wikimedia.org with reason: host reimage
* 13:22 urbanecm@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255686{{!}}CreateAccount: Add class to aide in instrumentation]], [[gerrit:1255685{{!}}createAccount: Log exposure and CTRs for account creation experiment (T419916)]] (duration: 12m 58s)
* 13:22 moritzm: upgrade rpki1001 to Routinator 0.15.1 [[phab:T420572|T420572]]
* 13:15 urbanecm@deploy2002: migr, urbanecm: Continuing with sync
* 13:13 urbanecm@deploy2002: migr, urbanecm: Backport for [[gerrit:1255686{{!}}CreateAccount: Add class to aide in instrumentation]], [[gerrit:1255685{{!}}createAccount: Log exposure and CTRs for account creation experiment (T419916)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:12 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host doh4003.wikimedia.org with OS bookworm
* 13:09 urbanecm@deploy2002: Started scap sync-world: Backport for [[gerrit:1255686{{!}}CreateAccount: Add class to aide in instrumentation]], [[gerrit:1255685{{!}}createAccount: Log exposure and CTRs for account creation experiment (T419916)]]
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh4003.wikimedia.org - jmm@cumin2002"
* 13:09 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh4003.wikimedia.org - jmm@cumin2002"
* 13:07 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 17 hosts with reason: upgrade
* 13:01 moritzm: installing rsync security updates
* 12:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1017.eqiad.wmnet
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm7001.magru.wmnet
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm7001.magru.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 12:59 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm7001.magru.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 12:57 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 12:54 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 12:53 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1017.eqiad.wmnet
* 12:53 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1016.eqiad.wmnet
* 12:52 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 12:52 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 12:51 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 12:51 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 12:50 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 12:50 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 12:50 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 12:50 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 12:49 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 12:48 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 12:48 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 12:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1016.eqiad.wmnet
* 12:47 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:46 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:46 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 12:46 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:46 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:45 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:45 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:45 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:45 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 12:43 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 12:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 12:43 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb2002.codfw.wmnet
* 12:41 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 12:41 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm7001.magru.wmnet
* 12:41 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 12:41 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 12:40 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 12:40 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 12:39 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 12:39 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 12:38 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 12:37 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 12:37 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:37 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
* 12:37 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh4003.wikimedia.org
* 12:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb2002.codfw.wmnet
* 12:29 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:27 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:25 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:25 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:24 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:23 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:22 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:22 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:10 urbanecm@deploy2002: mwscript-k8s job started: GrowthExperiments:reassignMentees --wiki=enwiki --mentor=Bilorv --performer=Bilorv --as-job # [[phab:T418194|T418194]]
* 11:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 11:59 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:58 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-codfw
* 11:57 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 11:53 moritzm: upgrade rpki2003 to Routinator 0.15.1 [[phab:T420572|T420572]]
* 11:46 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:40 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:40 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1017.eqiad.wmnet with reason: host reimage
* 11:26 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1017.eqiad.wmnet with reason: host reimage
* 11:18 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 11:18 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-codfw
* 11:11 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 11:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
* 11:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
* 11:05 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-codfw
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 10:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 10:55 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5]*<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 10:54 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host doh4003.wikimedia.org
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 10:54 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 10:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2004.codfw.wmnet
* 10:51 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 10:50 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:50 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 10:50 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 10:50 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:50 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 10:50 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 10:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2004.codfw.wmnet
* 10:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2005.codfw.wmnet
* 10:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema2004.codfw.wmnet
* 10:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2005.codfw.wmnet
* 10:43 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:43 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh4003.wikimedia.org
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4007.ulsfo.wmnet to cluster ulsfo02 and group 01
* 10:42 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4007.ulsfo.wmnet to cluster ulsfo02 and group 01
* 10:41 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host schema2004.codfw.wmnet
* 10:39 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema2003.codfw.wmnet
* 10:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host schema2003.codfw.wmnet
* 10:37 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1002.wikimedia.org
* 10:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema1004.eqiad.wmnet
* 10:36 Raine: created temporary categorylinks_icu72 tables -- [[phab:T419980|T419980]], [[phab:T419049|T419049]]
* 10:36 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 10:34 aokoth@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 10:33 aokoth@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 10:32 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host schema1004.eqiad.wmnet
* 10:32 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:31 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 10:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4007.ulsfo.wmnet
* 10:29 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5]*<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 10:28 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:28 fnegri@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1002.wikimedia.org
* 10:26 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2006.codfw.wmnet
* 10:25 btullis@cumin1003: END (PASS) - Cookbook sre.opensearch.roll-restart-reboot (exit_code=0) rolling reboot on A:datahubsearch
* 10:24 btullis@cumin1003: END (FAIL) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=99) rolling reboot on A:cephosd-eqiad
* 10:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4007.ulsfo.wmnet
* 10:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2006.codfw.wmnet
* 10:21 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1001.wikimedia.org
* 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2008.wikimedia.org
* 10:19 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:18 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2008.wikimedia.org
* 10:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2007.codfw.wmnet
* 10:13 fnegri@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1001.wikimedia.org
* 10:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2007.codfw.wmnet
* 10:09 btullis@cumin1003: START - Cookbook sre.opensearch.roll-restart-reboot rolling reboot on A:datahubsearch
* 10:04 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:03 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4007.ulsfo.wmnet with OS bookworm
* 09:58 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 17 hosts with reason: upgrade
* 09:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema1003.eqiad.wmnet
* 09:53 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host schema1003.eqiad.wmnet
* 09:46 jnuche@deploy2002: Finished deploy [releng/jenkins-deploy@863e5c2] (releasing): [[phab:T420477|T420477]] (duration: 01m 07s)
* 09:45 jnuche@deploy2002: Started deploy [releng/jenkins-deploy@863e5c2] (releasing): [[phab:T420477|T420477]]
* 09:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 09:43 jnuche@deploy2002: Finished deploy [releng/jenkins-deploy@863e5c2] (releasing): [[phab:T420477|T420477]] (duration: 00m 59s)
* 09:42 jnuche@deploy2002: Started deploy [releng/jenkins-deploy@863e5c2] (releasing): [[phab:T420477|T420477]]
* 09:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4007.ulsfo.wmnet with reason: host reimage
* 09:35 moritzm: installing libnginx-mod-http-lua security updates
* 09:35 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4007.ulsfo.wmnet with reason: host reimage
* 09:29 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
* 09:26 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
* 09:26 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
* 09:24 klausman@cumin2002: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:ml-serve-worker-codfw
* 09:21 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:21 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:19 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:19 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:13 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4007.ulsfo.wmnet with OS bookworm
* 09:11 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
* 09:01 moritzm: remove ganeti4007 from classic Ganeti cluster in ulsfo [[phab:T418993|T418993]]
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of doh4001.wikimedia.org to plain
* 08:54 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of doh4001.wikimedia.org to plain
* 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of doh4002.wikimedia.org to plain
* 08:46 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of doh4002.wikimedia.org to plain
* 08:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of hcaptcha-proxy4001.wikimedia.org to plain
* 08:45 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of hcaptcha-proxy4001.wikimedia.org to plain
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of hcaptcha-proxy4002.wikimedia.org to plain
* 08:44 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of hcaptcha-proxy4002.wikimedia.org to plain
* 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of install4003.wikimedia.org to plain
* 08:42 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of install4003.wikimedia.org to plain
* 08:40 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:38 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host doh4003.wikimedia.org
* 08:38 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 08:38 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 08:38 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:38 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 08:38 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 08:35 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 08:34 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 08:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 08:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 08:31 moritzm: installing python-apt security updates
* 08:29 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:29 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh4003.wikimedia.org
* 08:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 08:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 08:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 08:14 moritzm: installing imagemagick security updates on Bullseye
* 08:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1048.eqiad.wmnet
* 08:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1048.eqiad.wmnet
* 08:12 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.20 refs [[phab:T413811|T413811]]
* 08:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1048.eqiad.wmnet
* 07:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1048.eqiad.wmnet
* 07:17 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 07:17 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 07:16 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 07:16 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 07:14 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 07:14 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 04:53 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 00:06 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon2003.codfw.wmnet
* 00:02 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon2003.codfw.wmnet
* 00:01 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon1003.eqiad.wmnet
== 2026-03-18 ==
* 23:58 mutante: releases2003 - kill 782 (stunnel4) - systemctl start stunnel4 - fix [[phab:T420246|T420246]] [[phab:T420388|T420388]] [[phab:T420411|T420411]]
* 23:57 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon1003.eqiad.wmnet
* 23:49 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:cassandra-dev
* 23:23 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:cassandra-dev
* 23:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5017.*
* 23:02 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5020.*
* 23:01 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5028.*
* 22:40 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5028.eqsin.wmnet with OS trixie
* 22:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5020.eqsin.wmnet with OS trixie
* 22:08 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 22:04 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 21:51 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-eqiad
* 21:49 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on A:dnsbox
* 21:49 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7002.wikimedia.org
* 21:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 21:41 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5027.*
* 21:40 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 21:31 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 21:30 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5028.eqsin.wmnet with OS trixie
* 21:30 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7002.wikimedia.org
* 21:29 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5027.eqsin.wmnet with OS trixie
* 21:27 jforrester@deploy2002: mwscript-k8s job started: extensions/WikimediaMaintenance/maintenance/addWiki.php --wiki=abstractwiki # [[phab:T411723|T411723]] addWiki.php run
* 21:26 jforrester@deploy2002: mwscript-k8s job started: extensions/WikimediaMaintenance/maintenance/addWiki.php --wiki=abstractwiki # [[phab:T411723|T411723]] addWiki.php run
* 21:24 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255034{{!}}Revert "OrchestratorRequest: Switch evaluations to v2 endpoint" (T418887)]], [[gerrit:1247650{{!}}Create Abstract Wikipedia (T411725 T411726)]] (duration: 06m 44s)
* 21:20 jforrester@deploy2002: jforrester: Continuing with sync
* 21:20 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255034{{!}}Revert "OrchestratorRequest: Switch evaluations to v2 endpoint" (T418887)]], [[gerrit:1247650{{!}}Create Abstract Wikipedia (T411725 T411726)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:17 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255034{{!}}Revert "OrchestratorRequest: Switch evaluations to v2 endpoint" (T418887)]], [[gerrit:1247650{{!}}Create Abstract Wikipedia (T411725 T411726)]]
* 21:16 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5017.eqsin.wmnet with OS trixie
* 21:15 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7001.wikimedia.org
* 21:12 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5020.eqsin.wmnet with OS trixie
* 21:08 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255013{{!}}Guard for JS null deref on empty Parsoid sections (T419721)]], [[gerrit:1254889{{!}}Reapply "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]] (duration: 11m 20s)
* 21:07 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 21:07 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5028.eqsin.wmnet with OS trixie
* 21:04 jdlrobson@deploy2002: jdlrobson, harroyo-wmf: Continuing with sync
* 20:59 jdlrobson@deploy2002: jdlrobson, harroyo-wmf: Backport for [[gerrit:1255013{{!}}Guard for JS null deref on empty Parsoid sections (T419721)]], [[gerrit:1254889{{!}}Reapply "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:59 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-eqiad
* 20:58 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7001.wikimedia.org
* 20:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5027.eqsin.wmnet with reason: host reimage
* 20:57 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1255013{{!}}Guard for JS null deref on empty Parsoid sections (T419721)]], [[gerrit:1254889{{!}}Reapply "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]]
* 20:52 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-codfw
* 20:51 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5027.eqsin.wmnet with reason: host reimage
* 20:51 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in1001.wikimedia.org with reason: [[phab:T419960|T419960]]
* 20:50 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5020.eqsin.wmnet with OS trixie
* 20:50 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in2001.wikimedia.org with reason: [[phab:T419960|T419960]]
* 20:49 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 20:48 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5028.eqsin.wmnet with OS trixie
* 20:44 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5017.eqsin.wmnet with reason: host reimage
* 20:43 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns6002.wikimedia.org
* 20:42 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out1001.wikimedia.org with reason: [[phab:T419960|T419960]]
* 20:42 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1033.eqiad.wmnet with OS trixie
* 20:42 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 20:42 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 20:38 cscott@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254956{{!}}Limit legacy postprocessing cache to pages where DT does apply (T376183)]] (duration: 13m 54s)
* 20:37 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5017.eqsin.wmnet with reason: host reimage
* 20:35 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out2001.wikimedia.org with reason: [[phab:T419960|T419960]]
* 20:34 cscott@deploy2002: cscott: Continuing with sync
* 20:33 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns6002.wikimedia.org
* 20:28 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_codfw and not P<nowiki>{</nowiki>cp2042.codfw.wmnet<nowiki>}</nowiki> and A:cp
* 20:28 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2058.codfw.wmnet
* 20:28 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_codfw and not P<nowiki>{</nowiki>cp2041.codfw.wmnet<nowiki>}</nowiki> and A:cp
* 20:28 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2057.codfw.wmnet
* 20:26 cscott@deploy2002: cscott: Backport for [[gerrit:1254956{{!}}Limit legacy postprocessing cache to pages where DT does apply (T376183)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1033.eqiad.wmnet with reason: host reimage
* 20:24 cscott@deploy2002: Started scap sync-world: Backport for [[gerrit:1254956{{!}}Limit legacy postprocessing cache to pages where DT does apply (T376183)]]
* 20:24 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5027.eqsin.wmnet with OS trixie
* 20:22 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5029.*
* 20:21 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5029.eqsin.wmnet with OS trixie
* 20:20 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1033.eqiad.wmnet with reason: host reimage
* 20:18 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns6001.wikimedia.org
* 20:14 kemayo@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254965{{!}}Editcheck: fix tagging not happening for non-default checks]] (duration: 06m 28s)
* 20:10 kemayo@deploy2002: kemayo: Continuing with sync
* 20:10 kemayo@deploy2002: kemayo: Backport for [[gerrit:1254965{{!}}Editcheck: fix tagging not happening for non-default checks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1033.eqiad.wmnet with OS trixie
* 20:08 kemayo@deploy2002: Started scap sync-world: Backport for [[gerrit:1254965{{!}}Editcheck: fix tagging not happening for non-default checks]]
* 20:06 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 20:05 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns6001.wikimedia.org
* 20:05 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5017.eqsin.wmnet with OS trixie
* 20:05 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.*
* 20:05 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-codfw
* 19:51 Reedy: running `foreachwikiindblist fishbowl.dblist extensions/OATHAuth/maintenance/UpdateSecretsToEncryptedFormat.php` [[phab:T404363|T404363]]
* 19:51 Reedy: running `foreachwikiindblist private.dblist extensions/OATHAuth/maintenance/UpdateSecretsToEncryptedFormat.php` [[phab:T404363|T404363]]
* 19:50 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns5004.wikimedia.org
* 19:50 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2056.codfw.wmnet
* 19:50 Reedy: running `mwscript extensions/OATHAuth/maintenance/UpdateSecretsToEncryptedFormat.php --wiki=metawiki` [[phab:T404363|T404363]]
* 19:49 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5029.eqsin.wmnet with reason: host reimage
* 19:49 reedy@deploy2002: Synchronized private/PrivateSettings.php: Set $wgOATHSecretKey [[phab:T404363|T404363]] (duration: 05m 51s)
* 19:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2055.codfw.wmnet
* 19:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5030.eqsin.wmnet with OS trixie
* 19:42 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5029.eqsin.wmnet with reason: host reimage
* 19:39 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5017.eqsin.wmnet with OS trixie
* 19:35 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 19:35 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 19:33 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns5004.wikimedia.org
* 19:30 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host install4004.wikimedia.org with OS bookworm
* 19:29 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5020.eqsin.wmnet with OS trixie
* 19:28 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5020.eqsin.wmnet [reason: trixie reimaging]
* 19:28 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5018.eqsin.wmnet [reason: trixie reimaging]
* 19:27 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 19:27 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 19:26 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5018.eqsin.wmnet with OS trixie
* 19:23 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:23 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:18 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns5003.wikimedia.org
* 19:17 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 19:13 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 19:13 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on install4004.wikimedia.org with reason: host reimage
* 19:11 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2054.codfw.wmnet
* 19:08 brett@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts cp5029.eqsin.wmnet
* 19:08 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2053.codfw.wmnet
* 19:08 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5029.eqsin.wmnet with OS trixie
* 19:08 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on install4004.wikimedia.org with reason: host reimage
* 19:02 brett@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp5029.eqsin.wmnet
* 19:01 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns5003.wikimedia.org
* 18:56 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5031.*
* 18:56 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5031.eqsin.wmnet with OS trixie
* 18:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 18:47 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 18:46 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5030.eqsin.wmnet with OS trixie
* 18:46 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns4004.wikimedia.org
* 18:45 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host install4004.wikimedia.org with OS bookworm
* 18:45 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
* 18:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5032.eqsin.wmnet with OS trixie
* 18:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2052.codfw.wmnet
* 18:29 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2051.codfw.wmnet
* 18:27 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns4004.wikimedia.org
* 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5031.eqsin.wmnet with reason: host reimage
* 18:18 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5017.eqsin.wmnet with OS trixie
* 18:17 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5031.eqsin.wmnet with reason: host reimage
* 18:17 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5017.eqsin.wmnet [reason: trixie reimaging]
* 18:17 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5018.eqsin.wmnet with OS trixie
* 18:16 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5018.eqsin.wmnet [reason: trixie reimaging]
* 18:16 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5017.eqsin.wmnet [reason: trixie reimaging]
* 18:16 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5017.eqsin.wmnet [reason: trixie reimaging]
* 18:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3077.esams.wmnet [reason: trixie reimaging]
* 18:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3078.esams.wmnet [reason: trixie reimaging]
* 18:12 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 18:12 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns4003.wikimedia.org
* 18:09 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1258: Ready
* 18:07 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 18:01 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 17:59 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3078.esams.wmnet with OS trixie
* 17:56 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3077.esams.wmnet with OS trixie
* 17:55 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns4003.wikimedia.org
* 17:54 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2050.codfw.wmnet
* 17:51 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2049.codfw.wmnet
* 17:43 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:42 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:40 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns3004.wikimedia.org
* 17:39 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:38 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backupmon1001.eqiad.wmnet with reason: upgrade
* 17:35 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1347.eqiad.wmnet with OS trixie
* 17:33 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3078.esams.wmnet with reason: host reimage
* 17:32 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 17:32 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5031.eqsin.wmnet with OS trixie
* 17:32 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5032.eqsin.wmnet with OS trixie
* 17:32 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5031.eqsin.wmnet with OS trixie
* 17:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 17:30 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 17:30 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3077.esams.wmnet with reason: host reimage
* 17:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:29 claime: rearmed keyholder on deploy1003
* 17:28 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:27 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3078.esams.wmnet with reason: host reimage
* 17:26 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3077.esams.wmnet with reason: host reimage
* 17:25 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:25 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1258: Ready
* 17:23 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns3004.wikimedia.org
* 17:21 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 17:21 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 17:21 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:20 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:20 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-esams and A:ncredir
* 17:19 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1347.eqiad.wmnet with reason: host reimage
* 17:18 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-drmrs and A:ncredir
* 17:16 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-eqiad and A:ncredir
* 17:15 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-ulsfo and A:ncredir
* 17:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2048.codfw.wmnet
* 17:15 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 17:14 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1347.eqiad.wmnet with reason: host reimage
* 17:14 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5031.eqsin.wmnet with OS trixie
* 17:12 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 17:12 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2047.codfw.wmnet
* 17:11 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 17:09 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 17:09 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp3078.*
* 17:08 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 17:08 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 17:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3079.*
* 17:08 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns3003.wikimedia.org
* 17:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3078.*
* 17:07 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-eqiad and A:ncredir
* 17:07 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
* 17:07 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-esams and A:ncredir
* 17:06 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
* 17:06 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir2002.*
* 17:05 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-drmrs and A:ncredir
* 17:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir2002.codfw.wmnet
* 17:05 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-eqsin and A:ncredir
* 17:05 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-ulsfo and A:ncredir
* 17:04 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-magru and A:ncredir
* 17:03 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3078.esams.wmnet with OS trixie
* 17:02 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1347
* 17:02 jayme@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1347
* 17:02 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3078.esams.wmnet [reason: trixie reimaging]
* 17:01 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3076.esams.wmnet [reason: trixie reimaging]
* 17:01 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3077.esams.wmnet with OS trixie
* 17:01 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3077.esams.wmnet [reason: trixie reimaging]
* 16:59 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncredir2002.codfw.wmnet
* 16:58 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir2002.*
* 16:56 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 8 hosts with reason: upgrade
* 16:55 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir2001.*
* 16:55 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ncredir2001.codfw.wmnet
* 16:55 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for ncredir2001.codfw.wmnet
* 16:55 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3076.esams.wmnet with OS trixie
* 16:53 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2014.codfw.wmnet
* 16:52 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-eqsin and A:ncredir
* 16:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2008.codfw.wmnet with reason: kernel update
* 16:51 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-magru and A:ncredir
* 16:51 klausman@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on ml-serve1013.eqiad.wmnet with reason: Reboot for security update
* 16:50 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2013.codfw.wmnet
* 16:49 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir2001.*
* 16:49 brett@cumin2002: END (ERROR) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=97) rolling reboot on A:ncredir and A:ncredir
* 16:48 jayme@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1347
* 16:48 jayme@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1347.eqiad.wmnet 199.48.64.10.in-addr.arpa 9.9.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 16:48 jayme@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1347.eqiad.wmnet 199.48.64.10.in-addr.arpa 9.9.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 16:48 jayme@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:47 jayme@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1347 - jayme@cumin1003"
* 16:47 jayme@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1347 - jayme@cumin1003"
* 16:47 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns3003.wikimedia.org
* 16:47 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1012.eqiad.wmnet
* 16:47 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir and A:ncredir
* 16:47 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2012.codfw.wmnet
* 16:47 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2014.codfw.wmnet
* 16:46 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3075.esams.wmnet [reason: trixie reimaging]
* 16:46 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2003.codfw.wmnet
* 16:45 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet
* 16:44 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2013.codfw.wmnet
* 16:44 jayme@cumin1003: START - Cookbook sre.dns.netbox
* 16:43 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2009.codfw.wmnet
* 16:43 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1347
* 16:43 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99)
* 16:43 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1347.eqiad.wmnet with OS trixie
* 16:43 brett@cumin2002: START - Cookbook sre.hosts.reboot-cluster
* 16:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2007.codfw.wmnet with reason: kernel update
* 16:40 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2012.codfw.wmnet
* 16:39 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3079.esams.wmnet with OS trixie
* 16:39 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2008.codfw.wmnet
* 16:38 moritzm: installing PHP 8.2 security updates
* 16:37 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2009.codfw.wmnet
* 16:36 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2046.codfw.wmnet
* 16:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3078.esams.wmnet with OS trixie
* 16:34 moritzm: installing alsa-lib security updates
* 16:33 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2045.codfw.wmnet
* 16:32 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2008.codfw.wmnet
* 16:32 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns2006.wikimedia.org
* 16:29 moritzm: failover Ganeti master in eqiad to ganeti1046
* 16:29 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3076.esams.wmnet with reason: host reimage
* 16:29 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2003.codfw.wmnet
* 16:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1054.eqiad.wmnet
* 16:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1054.eqiad.wmnet
* 16:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2005.codfw.wmnet with reason: kernel update
* 16:24 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3076.esams.wmnet with reason: host reimage
* 16:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1054.eqiad.wmnet
* 16:22 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1012.eqiad.wmnet
* 16:20 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1013.eqiad.wmnet
* 16:19 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1011.eqiad.wmnet
* 16:18 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1054.eqiad.wmnet
* 16:18 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 16:16 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host install4004.wikimedia.org with OS bookworm
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1053.eqiad.wmnet
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1053.eqiad.wmnet
* 16:14 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns2006.wikimedia.org
* 16:14 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1013.eqiad.wmnet
* 16:14 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1009.eqiad.wmnet
* 16:13 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3079.esams.wmnet with reason: host reimage
* 16:13 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1011.eqiad.wmnet
* 16:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1029.eqiad.wmnet with reason: kernel update
* 16:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1016.eqiad.wmnet with OS bookworm
* 16:12 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 16:11 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 16:11 moritzm: powercycling ganeti1053 (stuck on reboot)
* 16:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3078.esams.wmnet with reason: host reimage
* 16:09 klausman@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:ml-serve-worker-eqiad
* 16:09 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 16:08 klausman@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:ml-serve-worker-eqiad
* 16:07 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1009.eqiad.wmnet
* 16:07 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1003.eqiad.wmnet
* 16:06 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3079.esams.wmnet with reason: host reimage
* 16:06 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 16:06 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3078.esams.wmnet with reason: host reimage
* 16:04 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 16:04 klausman@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:ml-serve-worker-eqiad
* 16:04 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 16:02 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 16:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1028.eqiad.wmnet with reason: kernel update
* 16:00 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1003.eqiad.wmnet
* 16:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1015.eqiad.wmnet
* 16:00 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3075.esams.wmnet with OS trixie
* 16:00 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3076.esams.wmnet with OS trixie
* 15:59 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns2005.wikimedia.org
* 15:58 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3076.esams.wmnet [reason: trixie reimaging]
* 15:58 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1012.eqiad.wmnet
* 15:58 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2044.codfw.wmnet
* 15:58 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3078.esams.wmnet [reason: trixie reimaging]
* 15:57 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3078.esams.wmnet [reason: trixie reimaging]
* 15:57 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1010.eqiad.wmnet
* 15:57 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1008.eqiad.wmnet
* 15:57 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3074.esams.wmnet [reason: trixie reimaging]
* 15:56 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 15:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1053.eqiad.wmnet
* 15:55 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2043.codfw.wmnet
* 15:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1017.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1023.eqiad.wmnet with reason: kernel update
* 15:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1015.eqiad.wmnet
* 15:54 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy1022.eqiad.wmnet
* 15:54 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1008.eqiad.wmnet
* 15:54 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy1022.eqiad.wmnet
* 15:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1053.eqiad.wmnet
* 15:52 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1052.eqiad.wmnet
* 15:52 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1010.eqiad.wmnet
* 15:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1052.eqiad.wmnet
* 15:51 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1012.eqiad.wmnet
* 15:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3074.esams.wmnet with OS trixie
* 15:49 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_codfw and not P<nowiki>{</nowiki>cp2042.codfw.wmnet<nowiki>}</nowiki> and A:cp
* 15:48 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1014.eqiad.wmnet
* 15:48 klausman@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on A:ml-serve-worker-eqiad
* 15:47 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy1003.eqiad.wmnet
* 15:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1052.eqiad.wmnet
* 15:46 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_codfw and not P<nowiki>{</nowiki>cp2041.codfw.wmnet<nowiki>}</nowiki> and A:cp
* 15:45 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1017.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:45 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns2005.wikimedia.org
* 15:42 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1014.eqiad.wmnet
* 15:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1052.eqiad.wmnet
* 15:41 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3079.esams.wmnet with OS trixie
* 15:41 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3078.esams.wmnet with OS trixie
* 15:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
* 15:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1051.eqiad.wmnet
* 15:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1022.eqiad.wmnet with reason: kernel update
* 15:38 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Update for dse-k8s-worker1016 - btullis@cumin1003"
* 15:37 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Update for dse-k8s-worker1016 - btullis@cumin1003"
* 15:37 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy1003.eqiad.wmnet
* 15:37 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1027.eqiad.wmnet
* 15:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1016.eqiad.wmnet with reason: host reimage
* 15:35 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-eqiad
* 15:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1051.eqiad.wmnet
* 15:34 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3075.esams.wmnet with reason: host reimage
* 15:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1372.eqiad.wmnet
* 15:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1371.eqiad.wmnet
* 15:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1370.eqiad.wmnet
* 15:31 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1027.eqiad.wmnet
* 15:30 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns2004.wikimedia.org
* 15:29 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1016.eqiad.wmnet with reason: host reimage
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1369.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1368.eqiad.wmnet
* 15:28 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1372.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1367.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1366.eqiad.wmnet
* 15:28 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1371.eqiad.wmnet
* 15:28 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1370.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1365.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1364.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1363.eqiad.wmnet
* 15:27 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1362.eqiad.wmnet
* 15:27 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1361.eqiad.wmnet
* 15:27 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1017
* 15:27 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1360.eqiad.wmnet
* 15:26 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1017
* 15:25 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3074.esams.wmnet with reason: host reimage
* 15:25 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install4004.wikimedia.org with OS bookworm
* 15:25 sukhe@dns1004: END - running authdns-update
* 15:24 sukhe@dns1004: START - running authdns-update
* 15:24 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install4004.wikimedia.org
* 15:24 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host install4004.wikimedia.org with OS bookworm
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1369.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1368.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1367.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1366.eqiad.wmnet
* 15:23 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1365.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1364.eqiad.wmnet
* 15:22 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1363.eqiad.wmnet
* 15:22 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1362.eqiad.wmnet
* 15:22 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1361.eqiad.wmnet
* 15:22 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1360.eqiad.wmnet
* 15:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1349.eqiad.wmnet
* 15:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1050.eqiad.wmnet
* 15:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1050.eqiad.wmnet
* 15:18 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3075.esams.wmnet with reason: host reimage
* 15:18 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3074.esams.wmnet with reason: host reimage
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1348.eqiad.wmnet
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1346.eqiad.wmnet
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1344.eqiad.wmnet
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1345.eqiad.wmnet
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1343.eqiad.wmnet
* 15:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1342.eqiad.wmnet
* 15:16 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns2004.wikimedia.org
* 15:15 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1349.eqiad.wmnet
* 15:15 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1341.eqiad.wmnet
* 15:15 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1340.eqiad.wmnet
* 15:15 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1339.eqiad.wmnet
* 15:15 moritzm: imported jenkins 2.541.3 for bullseye/bookworm/trixie
* 15:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1338.eqiad.wmnet
* 15:14 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1016.eqiad.wmnet with OS bookworm
* 15:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1050.eqiad.wmnet
* 15:12 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1348.eqiad.wmnet
* 15:12 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1346.eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1336.eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1337.eqiad.wmnet
* 15:12 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1345.eqiad.wmnet
* 15:12 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1344.eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1334.eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1335.eqiad.wmnet
* 15:11 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1343.eqiad.wmnet
* 15:11 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1342.eqiad.wmnet
* 15:11 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1332.eqiad.wmnet
* 15:11 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1333.eqiad.wmnet
* 15:11 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Post reimage - btullis@cumin1003"
* 15:11 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Post reimage - btullis@cumin1003"
* 15:10 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1341.eqiad.wmnet
* 15:10 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1340.eqiad.wmnet
* 15:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1331.eqiad.wmnet
* 15:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1330.eqiad.wmnet
* 15:10 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1339.eqiad.wmnet
* 15:09 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1329.eqiad.wmnet
* 15:09 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1338.eqiad.wmnet
* 15:09 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1328.eqiad.wmnet
* 15:09 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1050.eqiad.wmnet
* 15:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1049.eqiad.wmnet
* 15:07 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1337.eqiad.wmnet
* 15:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1049.eqiad.wmnet
* 15:07 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1336.eqiad.wmnet
* 15:07 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1335.eqiad.wmnet
* 15:06 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1334.eqiad.wmnet
* 15:06 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1333.eqiad.wmnet
* 15:06 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1332.eqiad.wmnet
* 15:05 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1331.eqiad.wmnet
* 15:05 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1330.eqiad.wmnet
* 15:04 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1329.eqiad.wmnet
* 15:04 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1328.eqiad.wmnet
* 15:03 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-eqiad
* 15:02 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1033.eqiad.wmnet with OS trixie
* 15:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1049.eqiad.wmnet
* 15:01 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns1006.wikimedia.org
* 14:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum4002.ulsfo.wmnet
* 14:58 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:58 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:57 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
* 14:54 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3075.esams.wmnet with OS trixie
* 14:54 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3075.esams.wmnet [reason: trixie reimaging]
* 14:54 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3074.esams.wmnet with OS trixie
* 14:53 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3074.esams.wmnet [reason: trixie reimaging]
* 14:52 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1047.eqiad.wmnet
* 14:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1047.eqiad.wmnet
* 14:50 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:49 slyngshede@dns1004: END - running authdns-update
* 14:48 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:48 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:48 slyngshede@dns1004: START - running authdns-update
* 14:47 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns1006.wikimedia.org
* 14:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1047.eqiad.wmnet
* 14:46 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum4002.ulsfo.wmnet
* 14:45 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-eqiad
* 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum4001.ulsfo.wmnet
* 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum4001.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:44 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum4001.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1047.eqiad.wmnet
* 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1046.eqiad.wmnet
* 14:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1046.eqiad.wmnet
* 14:40 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:38 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1046.eqiad.wmnet
* 14:36 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum4001.ulsfo.wmnet
* 14:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install4004.wikimedia.org with OS bookworm
* 14:33 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1046.eqiad.wmnet
* 14:32 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns1005.wikimedia.org
* 14:32 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "inline pattern and pattern equivalence - oblivian@cumin1003"
* 14:32 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: inline pattern and pattern equivalence - oblivian@cumin1003
* 14:31 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: inline pattern and pattern equivalence - oblivian@cumin1003
* 14:31 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "inline pattern and pattern equivalence - oblivian@cumin1003"
* 14:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1045.eqiad.wmnet
* 14:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1045.eqiad.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install4004.wikimedia.org - jmm@cumin2002"
* 14:25 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install4004.wikimedia.org - jmm@cumin2002"
* 14:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1045.eqiad.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install4004.wikimedia.org on all recursors
* 14:24 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install4004.wikimedia.org on all recursors
* 14:21 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) install4004.wikimedia.org on all recursors
* 14:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install4004.wikimedia.org on all recursors
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install4004.wikimedia.org - jmm@cumin2002"
* 14:20 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install4004.wikimedia.org - jmm@cumin2002"
* 14:19 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns1005.wikimedia.org
* 14:17 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254911{{!}}Restore quotation-marks in ext.wikilambda.app messages (T420456)]] (duration: 06m 32s)
* 14:17 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:17 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS bookworm
* 14:17 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:16 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:16 apine@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install4004.wikimedia.org
* 14:15 apine@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:15 apine@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:14 apine@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:14 apine@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:14 jforrester@deploy2002: jforrester: Continuing with sync
* 14:13 apine@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:13 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:13 jforrester@deploy2002: jforrester: Backport for [[gerrit:1254911{{!}}Restore quotation-marks in ext.wikilambda.app messages (T420456)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:13 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:11 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1254911{{!}}Restore quotation-marks in ext.wikilambda.app messages (T420456)]]
* 14:08 apine@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 apine@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 apine@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:06 apine@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:05 XioNoX: set graceful-shutdown on EdgeUno transit sessions
* 14:05 apine@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:04 apine@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:04 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns1004.wikimedia.org
* 14:02 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1045.eqiad.wmnet
* 14:01 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 14:01 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:59 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 13:57 Msz2001: UTC afternoon backport+config window done
* 13:56 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254876{{!}}Tweak configuration of external link aggregate usage analysis (T419837)]] (duration: 06m 41s)
* 13:55 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1033.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:55 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1033.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:53 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 13:52 mszwarc@deploy2002: mszwarc: Continuing with sync
* 13:51 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1254876{{!}}Tweak configuration of external link aggregate usage analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:50 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns1004.wikimedia.org
* 13:50 sukhe@cumin1003: START - Cookbook sre.dns.roll-reboot rolling reboot on A:dnsbox
* 13:49 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1254876{{!}}Tweak configuration of external link aggregate usage analysis (T419837)]]
* 13:49 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254916{{!}}Normalize external domain names in click analysis (T419837)]], [[gerrit:1254917{{!}}Normalize external domain names in click analysis (T419837)]] (duration: 07m 23s)
* 13:46 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet
* 13:45 mszwarc@deploy2002: mszwarc: Continuing with sync
* 13:43 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1254916{{!}}Normalize external domain names in click analysis (T419837)]], [[gerrit:1254917{{!}}Normalize external domain names in click analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2096.codfw.wmnet
* 13:41 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1033.eqiad.wmnet with OS trixie
* 13:41 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS bookworm
* 13:41 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1254916{{!}}Normalize external domain names in click analysis (T419837)]], [[gerrit:1254917{{!}}Normalize external domain names in click analysis (T419837)]]
* 13:40 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248095{{!}}filebackend: Remove outdated comment]], [[gerrit:1254216{{!}}GrowthExperiments: increase edit and thanks query limit II (T341599)]] (duration: 08m 47s)
* 13:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1044.eqiad.wmnet
* 13:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1044.eqiad.wmnet
* 13:39 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet
* 13:39 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1096.eqiad.wmnet
* 13:37 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2096.codfw.wmnet
* 13:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:36 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2095.codfw.wmnet
* 13:36 sgimeno@deploy2002: matmarex, sgimeno: Continuing with sync
* 13:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1044.eqiad.wmnet
* 13:33 sgimeno@deploy2002: matmarex, sgimeno: Backport for [[gerrit:1248095{{!}}filebackend: Remove outdated comment]], [[gerrit:1254216{{!}}GrowthExperiments: increase edit and thanks query limit II (T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:31 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1096.eqiad.wmnet
* 13:31 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1095.eqiad.wmnet
* 13:31 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1248095{{!}}filebackend: Remove outdated comment]], [[gerrit:1254216{{!}}GrowthExperiments: increase edit and thanks query limit II (T341599)]]
* 13:30 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2095.codfw.wmnet
* 13:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2094.codfw.wmnet
* 13:28 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254894{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254895{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254891{{!}}Revert "SpecialPreferences: Use Language Select Widget in language field" (T419895)]], [[gerrit:1254890{{!}}Revert "SpecialPreferences: Use Language Select Widget in lan
* 13:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 13:28 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 13:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1026.eqiad.wmnet
* 13:27 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1044.eqiad.wmnet
* 13:26 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 13:25 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1095.eqiad.wmnet
* 13:25 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1094.eqiad.wmnet
* 13:24 sgimeno@deploy2002: somerandomdeveloper, sgimeno: Continuing with sync
* 13:24 sgimeno@deploy2002: somerandomdeveloper, sgimeno: Backport for [[gerrit:1254894{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254895{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254891{{!}}Revert "SpecialPreferences: Use Language Select Widget in language field" (T419895)]], [[gerrit:1254890{{!}}Revert "SpecialPreferences: Use Language Select Widget in
* 13:23 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:23 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2094.codfw.wmnet
* 13:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2093.codfw.wmnet
* 13:22 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1254894{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254895{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254891{{!}}Revert "SpecialPreferences: Use Language Select Widget in language field" (T419895)]], [[gerrit:1254890{{!}}Revert "SpecialPreferences: Use Language Select Widget in lang
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1043.eqiad.wmnet
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1043.eqiad.wmnet
* 13:21 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1026.eqiad.wmnet
* 13:20 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:16 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2093.codfw.wmnet
* 13:16 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2092.codfw.wmnet
* 13:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1043.eqiad.wmnet
* 13:16 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1094.eqiad.wmnet
* 13:16 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1093.eqiad.wmnet
* 13:15 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:15 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 13:15 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 13:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1043.eqiad.wmnet
* 13:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1042.eqiad.wmnet
* 13:10 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:10 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 13:09 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1093.eqiad.wmnet
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1042.eqiad.wmnet
* 13:09 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2092.codfw.wmnet
* 13:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1027.eqiad.wmnet with reason: host reimage
* 13:08 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1016
* 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 13:06 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1016
* 13:06 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 13:04 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1027.eqiad.wmnet with reason: host reimage
* 13:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1042.eqiad.wmnet
* 13:02 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 13:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1012.eqiad.wmnet
* 12:58 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1020.eqiad.wmnet with OS bookworm
* 12:58 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 12:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1042.eqiad.wmnet
* 12:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1015.eqiad.wmnet
* 12:55 ayounsi@dns1004: END - running authdns-update
* 12:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1012.eqiad.wmnet
* 12:54 ayounsi@dns1004: START - running authdns-update
* 12:53 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1012.eqiad.wmnet
* 12:53 ayounsi@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 12:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1015.eqiad.wmnet
* 12:50 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 12:50 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1041.eqiad.wmnet
* 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1041.eqiad.wmnet
* 12:47 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-codfw
* 12:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1041.eqiad.wmnet
* 12:42 btullis@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-jumbo-eqiad
* 12:38 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:37 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:37 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:36 ayounsi@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 12:35 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1372].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 12:33 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:32 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:31 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:30 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1020.eqiad.wmnet with reason: host reimage
* 12:27 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1041.eqiad.wmnet
* 12:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1040.eqiad.wmnet
* 12:25 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1020.eqiad.wmnet with reason: host reimage
* 12:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1040.eqiad.wmnet
* 12:25 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 12:25 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1026.eqiad.wmnet with OS bookworm
* 12:25 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:24 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Update for dse-k8s-worker1015 - btullis@cumin1003"
* 12:24 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Update for dse-k8s-worker1015 - btullis@cumin1003"
* 12:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 12:22 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:21 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1040.eqiad.wmnet
* 12:19 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:19 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host serpens.wikimedia.org
* 12:13 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254851{{!}}Enable autodemotion for 2FA-less CN admins and WMF T&S (T418580)]] (duration: 06m 21s)
* 12:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host serpens.wikimedia.org
* 12:10 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1040.eqiad.wmnet
* 12:10 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-eqiad
* 12:09 mszwarc@deploy2002: mszwarc: Continuing with sync
* 12:09 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1254851{{!}}Enable autodemotion for 2FA-less CN admins and WMF T&S (T418580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:07 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1254851{{!}}Enable autodemotion for 2FA-less CN admins and WMF T&S (T418580)]]
* 12:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1039.eqiad.wmnet
* 12:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1039.eqiad.wmnet
* 12:05 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:05 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-eqiad
* 12:04 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:03 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:02 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254883{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254884{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]], [[gerrit:1254882{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254881{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]] (duration: 06m 48s)
* 12:02 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1026.eqiad.wmnet with reason: host reimage
* 12:02 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-eqiad
* 12:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1039.eqiad.wmnet
* 12:01 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1039.eqiad.wmnet
* 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet
* 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
* 11:59 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1026.eqiad.wmnet with reason: host reimage
* 11:58 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 11:58 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1012.eqiad.wmnet
* 11:57 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1254883{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254884{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]], [[gerrit:1254882{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254881{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]] synced to the testservers (see https://wikitech.wikimedia.
* 11:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 11:56 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1372].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 11:56 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-codfw
* 11:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1027.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 11:55 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1254883{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254884{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]], [[gerrit:1254882{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254881{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]]
* 11:54 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1027.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 11:54 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 11:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
* 11:50 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:50 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Updating for dse-k8s-worker1012 - btullis@cumin1003"
* 11:49 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Updating for dse-k8s-worker1012 - btullis@cumin1003"
* 11:49 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1015.eqiad.wmnet with reason: host reimage
* 11:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1037.eqiad.wmnet
* 11:48 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1037.eqiad.wmnet
* 11:48 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1307.eqiad.wmnet
* 11:48 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1307.eqiad.wmnet
* 11:47 claime: sudo homer lsw1-e5-eqiad* commit 'wikikube-worker1307 to active'
* 11:47 cgoubert@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:46 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1015.eqiad.wmnet with reason: host reimage
* 11:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 11:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1026.eqiad.wmnet with OS bookworm
* 11:44 cgoubert@cumin1003: START - Cookbook sre.dns.netbox
* 11:42 cgoubert@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1026.eqiad.wmnet with OS bookworm
* 11:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet
* 11:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
* 11:39 cgoubert@cumin1003: START - Cookbook sre.dns.netbox
* 11:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1037.eqiad.wmnet
* 11:37 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2091.codfw.wmnet
* 11:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet
* 11:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1036.eqiad.wmnet
* 11:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet
* 11:36 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1347.eqiad.wmnet
* 11:34 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1020.eqiad.wmnet with OS bookworm
* 11:31 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 11:31 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2091.codfw.wmnet
* 11:30 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1092.eqiad.wmnet
* 11:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2090.codfw.wmnet
* 11:30 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1347.eqiad.wmnet
* 11:30 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet
* 11:30 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 11:30 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 11:29 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 11:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet
* 11:29 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 11:28 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 11:28 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 11:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet
* 11:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet
* 11:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2090.codfw.wmnet
* 11:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2089.codfw.wmnet
* 11:23 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1092.eqiad.wmnet
* 11:23 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1091.eqiad.wmnet
* 11:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1035.eqiad.wmnet
* 11:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet
* 11:20 btullis@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host dse-k8s-worker1015
* 11:20 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1015
* 11:18 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1091.eqiad.wmnet
* 11:18 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1090.eqiad.wmnet
* 11:18 vgutierrez@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: no reason specified, no task ID specified]
* 11:18 vgutierrez@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: no reason specified, no task ID specified]
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping2004.codfw.wmnet
* 11:17 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
* 11:17 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2088.codfw.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet
* 11:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping2004.codfw.wmnet
* 11:13 vgutierrez@dns1004: END - running authdns-update
* 11:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1035.eqiad.wmnet
* 11:11 vgutierrez@dns1004: START - running authdns-update
* 11:11 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1090.eqiad.wmnet
* 11:11 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1089.eqiad.wmnet
* 11:10 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2088.codfw.wmnet
* 11:10 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2087.codfw.wmnet
* 11:08 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1034.eqiad.wmnet
* 11:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1034.eqiad.wmnet
* 11:07 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1026.eqiad.wmnet with OS bookworm
* 11:05 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1026.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 11:05 btullis@cumin1003: END (PASS) - Cookbook sre.hadoop.reboot-workers (exit_code=0) for Hadoop test cluster
* 11:04 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1089.eqiad.wmnet
* 11:04 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1026.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 11:04 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1088.eqiad.wmnet
* 11:03 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2087.codfw.wmnet
* 11:03 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2086.codfw.wmnet
* 11:03 vgutierrez@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1034.eqiad.wmnet
* 11:00 vgutierrez@cumin1003: START - Cookbook sre.dns.netbox
* 10:59 btullis@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-jumbo-eqiad
* 10:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1034.eqiad.wmnet
* 10:57 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1088.eqiad.wmnet
* 10:57 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1087.eqiad.wmnet
* 10:57 fabfur@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1033.eqiad.wmnet
* 10:56 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2086.codfw.wmnet
* 10:56 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2085.codfw.wmnet
* 10:56 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 10:53 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1087.eqiad.wmnet
* 10:53 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1086.eqiad.wmnet
* 10:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet
* 10:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
* 10:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1032.eqiad.wmnet
* 10:49 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2085.codfw.wmnet
* 10:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet
* 10:49 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2084.codfw.wmnet
* 10:46 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1086.eqiad.wmnet
* 10:46 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1085.eqiad.wmnet
* 10:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet
* 10:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1032.eqiad.wmnet
* 10:40 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2084.codfw.wmnet
* 10:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1031.eqiad.wmnet
* 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2083.codfw.wmnet
* 10:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1031.eqiad.wmnet
* 10:39 fabfur@cumin1003: START - Cookbook sre.dns.netbox
* 10:39 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1085.eqiad.wmnet
* 10:39 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1084.eqiad.wmnet
* 10:37 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2001.codfw.wmnet
* 10:34 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:34 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1031.eqiad.wmnet
* 10:32 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1084.eqiad.wmnet
* 10:32 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1083.eqiad.wmnet
* 10:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2083.codfw.wmnet
* 10:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1031.eqiad.wmnet
* 10:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2082.codfw.wmnet
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1030.eqiad.wmnet
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1030.eqiad.wmnet
* 10:32 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-staging2001.codfw.wmnet
* 10:31 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2002.codfw.wmnet
* 10:26 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-staging2002.codfw.wmnet
* 10:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1030.eqiad.wmnet
* 10:25 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1083.eqiad.wmnet
* 10:25 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1082.eqiad.wmnet
* 10:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2082.codfw.wmnet
* 10:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2081.codfw.wmnet
* 10:24 btullis@cumin1003: START - Cookbook sre.hadoop.reboot-workers for Hadoop test cluster
* 10:23 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2003.codfw.wmnet
* 10:23 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1030.eqiad.wmnet
* 10:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1029.eqiad.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet
* 10:19 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-staging2003.codfw.wmnet
* 10:18 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1082.eqiad.wmnet
* 10:17 vgutierrez@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: no reason specified, no task ID specified]
* 10:17 vgutierrez@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: no reason specified, no task ID specified]
* 10:17 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2081.codfw.wmnet
* 10:17 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2001.codfw.wmnet
* 10:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet
* 10:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1029.eqiad.wmnet
* 10:14 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2001.codfw.wmnet
* 10:14 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2002.codfw.wmnet
* 10:13 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1081.eqiad.wmnet
* 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1028.eqiad.wmnet
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1028.eqiad.wmnet
* 10:11 vgutierrez@dns1004: END - running authdns-update
* 10:10 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2002.codfw.wmnet
* 10:10 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2001.codfw.wmnet
* 10:09 vgutierrez@dns1004: START - running authdns-update
* 10:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2080.codfw.wmnet
* 10:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1028.eqiad.wmnet
* 10:06 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1081.eqiad.wmnet
* 10:06 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1080.eqiad.wmnet
* 10:05 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2001.codfw.wmnet
* 10:05 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2002.codfw.wmnet
* 10:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: no reason specified, [[phab:T418971|T418971]]]
* 10:04 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: no reason specified, [[phab:T418971|T418971]]]
* 10:03 slyngshede@cumin1003: END (FAIL) - Cookbook sre.dns.admin (exit_code=99) DNS admin: pool ulsfo [reason: no reason specified, no task ID specified]
* 10:03 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: no reason specified, no task ID specified]
* 10:01 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2002.codfw.wmnet
* 10:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 23 hosts
* 10:01 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1028.eqiad.wmnet
* 10:01 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1002.eqiad.wmnet
* 10:01 slyngshede@cumin1003: START - Cookbook sre.hosts.remove-downtime for 23 hosts
* 09:59 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1080.eqiad.wmnet
* 09:59 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1079.eqiad.wmnet
* 09:59 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2080.codfw.wmnet
* 09:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2079.codfw.wmnet
* 09:58 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1002.eqiad.wmnet
* 09:57 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1001.eqiad.wmnet
* 09:56 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1027.eqiad.wmnet
* 09:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet
* 09:52 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1001.eqiad.wmnet
* 09:51 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2079.codfw.wmnet
* 09:51 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2078.codfw.wmnet
* 09:51 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1079.eqiad.wmnet
* 09:51 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1003.eqiad.wmnet
* 09:51 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1078.eqiad.wmnet
* 09:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1003.eqiad.wmnet
* 09:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet
* 09:48 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1003.eqiad.wmnet
* 09:48 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1002.eqiad.wmnet
* 09:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1003.eqiad.wmnet
* 09:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1027.eqiad.wmnet
* 09:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 09:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1026.eqiad.wmnet
* 09:46 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1002.eqiad.wmnet
* 09:46 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1001.eqiad.wmnet
* 09:45 moritzm: installing postgresql-15 security updates
* 09:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart A:lvs-secondary-ulsfo and A:liberica
* 09:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) pooling P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2078.codfw.wmnet
* 09:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2077.codfw.wmnet
* 09:45 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.admin pooling P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) depooling P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:44 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.admin depooling P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2003.codfw.wmnet
* 09:44 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.upgrade restart A:lvs-secondary-ulsfo and A:liberica
* 09:44 jayme: switched wikikube staging apiservers to IPIP and maglev in eqiad and codfw - [[phab:T352956|T352956]]
* 09:43 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1078.eqiad.wmnet
* 09:43 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1077.eqiad.wmnet
* 09:42 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1001.eqiad.wmnet
* 09:40 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: wikikube-staging-master-eqiad@eqiad
* 09:40 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:40 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading A:lvs-secondary-ulsfo and A:liberica ([[phab:T418971|T418971]])
* 09:40 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading A:lvs-secondary-ulsfo and A:liberica ([[phab:T418971|T418971]])
* 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2003.codfw.wmnet
* 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1026.eqiad.wmnet
* 09:39 jayme@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:38 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2077.codfw.wmnet
* 09:37 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2076.codfw.wmnet
* 09:37 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-codfw
* 09:37 jayme@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: wikikube-staging-master-eqiad@eqiad
* 09:36 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1077.eqiad.wmnet
* 09:36 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1076.eqiad.wmnet
* 09:35 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: wikikube-staging-master-codfw@codfw
* 09:35 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:35 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 09:30 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2076.codfw.wmnet
* 09:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2075.codfw.wmnet
* 09:26 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1076.eqiad.wmnet
* 09:26 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1075.eqiad.wmnet
* 09:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 09:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1025.eqiad.wmnet
* 09:23 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2075.codfw.wmnet
* 09:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2074.codfw.wmnet
* 09:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 09:19 jayme@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:18 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1075.eqiad.wmnet
* 09:18 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1074.eqiad.wmnet
* 09:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 09:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1025.eqiad.wmnet
* 09:15 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2074.codfw.wmnet
* 09:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2073.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: wikikube-staging-master-codfw@codfw
* 09:14 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 09:13 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-codfw
* 09:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 09:12 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-eqiad
* 09:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1024.eqiad.wmnet
* 09:10 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1074.eqiad.wmnet
* 09:10 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1073.eqiad.wmnet
* 09:08 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2073.codfw.wmnet
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2072.codfw.wmnet
* 09:08 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 23 hosts with reason: Update ULSFO LVS service IPs
* 09:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1024.eqiad.wmnet
* 09:03 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1073.eqiad.wmnet
* 09:03 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1072.eqiad.wmnet
* 09:02 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2072.codfw.wmnet
* 09:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2071.codfw.wmnet
* 09:02 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: no reason specified, [[phab:T418971|T418971]]]
* 09:02 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: no reason specified, [[phab:T418971|T418971]]]
* 09:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping1004.eqiad.wmnet
* 09:00 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 08:58 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1023.eqiad.wmnet
* 08:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1023.eqiad.wmnet
* 08:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping1004.eqiad.wmnet
* 08:56 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1072.eqiad.wmnet
* 08:56 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1071.eqiad.wmnet
* 08:55 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2071.codfw.wmnet
* 08:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2070.codfw.wmnet
* 08:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1023.eqiad.wmnet
* 08:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1023.eqiad.wmnet
* 08:48 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1071.eqiad.wmnet
* 08:47 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2070.codfw.wmnet
* 08:46 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-eqiad
* 08:29 hashar: Restarting CI Jenkins for plugin upgrade # [[phab:T420347|T420347]]
* 08:22 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.20 refs [[phab:T413811|T413811]]
* 07:45 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 32934
* 07:42 btullis@cumin1003: END (PASS) - Cookbook sre.hadoop.reboot-workers (exit_code=0) for Hadoop analytics cluster
* 07:35 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 32934
* 07:22 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
* 07:16 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
* 06:54 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
* 06:38 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
* 03:22 musikanimal@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254468{{!}}CM5: add more aggressive warnings about CM5 deprecation (T373720)]] (duration: 12m 22s)
* 03:18 musikanimal@deploy2002: musikanimal: Continuing with sync
* 03:11 musikanimal@deploy2002: musikanimal: Backport for [[gerrit:1254468{{!}}CM5: add more aggressive warnings about CM5 deprecation (T373720)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 03:09 musikanimal@deploy2002: Started scap sync-world: Backport for [[gerrit:1254468{{!}}CM5: add more aggressive warnings about CM5 deprecation (T373720)]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 47s)
* 02:07 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 02:06 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 02:05 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 02:05 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 02:05 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 02:05 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 02:04 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:38 denisse@deploy2002: Finished deploy [librenms/librenms@9bdfb73]: Upgrade LibreNMS to 26.3.1 (duration: 00m 19s)
* 01:38 denisse@deploy2002: Started deploy [librenms/librenms@9bdfb73]: Upgrade LibreNMS to 26.3.1
* 01:10 denisse@deploy2002: Finished deploy [librenms/librenms@d152b36]: Upgrade LibreNMS to 25.11.0 (duration: 00m 08s)
* 01:10 denisse@deploy2002: Started deploy [librenms/librenms@d152b36]: Upgrade LibreNMS to 25.11.0
== 2026-03-17 ==
* 23:44 btullis@cumin1003: START - Cookbook sre.hadoop.reboot-workers for Hadoop analytics cluster
* 23:38 btullis@cumin1003: END (FAIL) - Cookbook sre.hadoop.reboot-workers (exit_code=99) for Hadoop analytics cluster
* 22:55 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3081.*
* 22:20 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3073.esams.wmnet [reason: trixie reimaging]
* 22:15 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3073.esams.wmnet with OS trixie
* 22:11 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3072.esams.wmnet [reason: trixie reimaging]
* 22:10 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3072.esams.wmnet with OS trixie
* 22:05 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on releases1003.eqiad.wmnet with reason: [[phab:T420246|T420246]]
* 22:05 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T420246|T420246]]
* 21:48 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3073.esams.wmnet with reason: host reimage
* 21:44 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3072.esams.wmnet with reason: host reimage
* 21:41 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3073.esams.wmnet with reason: host reimage
* 21:39 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3072.esams.wmnet with reason: host reimage
* 21:38 ryankemper: [[phab:T411568|T411568]] Failed back HDFS NameNode from an-master1004 to an-master1003; cluster back to original active/standby configuration
* 21:15 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3073.esams.wmnet with OS trixie
* 21:14 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3073.esams.wmnet [reason: trixie reimaging]
* 21:14 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3072.esams.wmnet with OS trixie
* 21:14 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3072.esams.wmnet [reason: trixie reimaging]
* 21:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3070.esams.wmnet [reason: trixie reimaging]
* 21:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3071.esams.wmnet [reason: trixie reimaging]
* 21:09 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3070.esams.wmnet with OS trixie
* 21:05 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3071.esams.wmnet with OS trixie
* 20:59 alexsanford@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254280{{!}}Remove notice from login form in popup mode (T418534)]] (duration: 07m 32s)
* 20:56 alexsanford@deploy2002: alexsanford: Continuing with sync
* 20:54 alexsanford@deploy2002: alexsanford: Backport for [[gerrit:1254280{{!}}Remove notice from login form in popup mode (T418534)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 alexsanford@deploy2002: Started scap sync-world: Backport for [[gerrit:1254280{{!}}Remove notice from login form in popup mode (T418534)]]
* 20:48 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:43 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:43 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3070.esams.wmnet with reason: host reimage
* 20:40 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 20:40 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 20:38 ryankemper: [[phab:T411568|T411568]] failed over HDFS NameNode from an-master1003 to an-master1004, then rebooted `an-master1003`
* 20:38 ryankemper: [[phab:T411568|T411568]] rebooted `an-coord1003`, `an-coord1004`, `an-tool1007`, `an-tool1008`, `an-tool1011`, `an-web1001`
* 20:38 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3071.esams.wmnet with reason: host reimage
* 20:34 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3070.esams.wmnet with reason: host reimage
* 20:34 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3071.esams.wmnet with reason: host reimage
* 20:31 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254301{{!}}Passwordless login: Don't display conditional auth errors]], [[gerrit:1254302{{!}}Passwordless login: Don't display conditional auth errors]] (duration: 08m 56s)
* 20:27 catrope@deploy2002: catrope: Continuing with sync
* 20:24 catrope@deploy2002: catrope: Backport for [[gerrit:1254301{{!}}Passwordless login: Don't display conditional auth errors]], [[gerrit:1254302{{!}}Passwordless login: Don't display conditional auth errors]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:22 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1254301{{!}}Passwordless login: Don't display conditional auth errors]], [[gerrit:1254302{{!}}Passwordless login: Don't display conditional auth errors]]
* 20:16 ryankemper: [[phab:T411568|T411568]] rebooted `an-test-master1002`, `an-test-master1003`, `an-test-master1004`, `archiva1002`
* 20:12 aude@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251309{{!}}Set wgReadingListsBetaDefaultForNewAccountsAfter for beta cluster (T419163)]] (duration: 08m 53s)
* 20:09 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3071.esams.wmnet with OS trixie
* 20:09 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3070.esams.wmnet with OS trixie
* 20:08 aude@deploy2002: aude: Continuing with sync
* 20:08 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3070.esams.wmnet [reason: trixie reimaging]
* 20:08 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3068.esams.wmnet [reason: trixie reimaging]
* 20:07 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3069.esams.wmnet [reason: trixie reimaging]
* 20:06 aude@deploy2002: aude: Backport for [[gerrit:1251309{{!}}Set wgReadingListsBetaDefaultForNewAccountsAfter for beta cluster (T419163)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:03 aude@deploy2002: Started scap sync-world: Backport for [[gerrit:1251309{{!}}Set wgReadingListsBetaDefaultForNewAccountsAfter for beta cluster (T419163)]]
* 19:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3081.esams.wmnet with OS trixie
* 19:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3069.esams.wmnet with OS trixie
* 19:54 ryankemper: [[phab:T411568|T411568]] rebooted `an-test-client1002`, `an-test-ui1001`, `an-test-coord1001`, `an-test-master1001`
* 19:50 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3068.esams.wmnet with OS trixie
* 19:46 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 19:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast2003.wikimedia.org with OS trixie
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3081.esams.wmnet with reason: host reimage
* 19:28 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3081.esams.wmnet with reason: host reimage
* 19:28 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases1003.eqiad.wmnet with reason: [[phab:T420246|T420246]]
* 19:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3069.esams.wmnet with reason: host reimage
* 19:23 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3068.esams.wmnet with reason: host reimage
* 19:21 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3069.esams.wmnet with reason: host reimage
* 19:20 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3068.esams.wmnet with reason: host reimage
* 19:19 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast2003.wikimedia.org with reason: host reimage
* 19:11 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on bast2003.wikimedia.org with reason: host reimage
* 19:08 dzahn@dns1004: END - running authdns-update
* 19:07 dzahn@dns1004: START - running authdns-update
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3081.esams.wmnet with OS trixie
* 19:05 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp3081.esams.wmnet with OS trixie
* 19:00 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3080.*
* 18:56 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3069.esams.wmnet with OS trixie
* 18:55 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3069.esams.wmnet [reason: trixie reimaging]
* 18:55 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3068.esams.wmnet with OS trixie
* 18:55 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid public cluster: Reboot Druid nodes
* 18:54 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3068.esams.wmnet [reason: trixie reimaging]
* 18:53 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3066.esams.wmnet [reason: trixie reimaging]
* 18:53 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3067.esams.wmnet [reason: trixie reimaging]
* 18:50 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host bast2003.wikimedia.org with OS trixie
* 18:49 swfrench-wmf: manually uncordoned wikikube-worker-exp1001.eqiad.wmnet after failed reboot
* 18:47 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3080.esams.wmnet with OS trixie
* 18:43 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3067.esams.wmnet with OS trixie
* 18:40 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3066.esams.wmnet with OS trixie
* 18:32 dwisehaupt@dns1005: END - running authdns-update
* 18:31 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast2003.wikimedia.org with OS bookworm
* 18:31 dwisehaupt@dns1005: START - running authdns-update
* 18:25 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 18:25 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 18:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3080.esams.wmnet with reason: host reimage
* 18:19 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[5-8].magru.wmnet<nowiki>}</nowiki> and A:cp
* 18:19 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7008.magru.wmnet
* 18:17 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:16 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:16 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3067.esams.wmnet with reason: host reimage
* 18:16 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 18:13 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3066.esams.wmnet with reason: host reimage
* 18:09 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3080.esams.wmnet with reason: host reimage
* 18:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast2003.wikimedia.org with reason: host reimage
* 18:04 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3067.esams.wmnet with reason: host reimage
* 18:03 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3066.esams.wmnet with reason: host reimage
* 18:03 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on bast2003.wikimedia.org with reason: host reimage
* 17:52 cgoubert@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1312-1327].eqiad.wmnet,wikikube-worker-exp1001.eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 17:52 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 17:52 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 17:46 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3080.esams.wmnet with OS trixie
* 17:44 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:43 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:43 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:42 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3081.esams.wmnet with OS trixie
* 17:42 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp3081.esams.wmnet with OS trixie
* 17:42 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:41 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid public cluster: Reboot Druid nodes
* 17:40 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:39 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7013.magru.wmnet,cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 17:39 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7016.magru.wmnet
* 17:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:37 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:37 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7007.magru.wmnet
* 17:37 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:34 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:33 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:31 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:29 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3067.esams.wmnet with OS trixie
* 17:29 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3067.esams.wmnet [reason: trixie reimaging]
* 17:28 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3066.esams.wmnet with OS trixie
* 17:28 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:27 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp3066.esams.wmnet with OS trixie
* 17:27 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3066.esams.wmnet with OS trixie
* 17:26 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:26 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3066.esams.wmnet [reason: trixie reimaging]
* 17:21 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:20 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:19 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3081.esams.wmnet with OS trixie
* 17:19 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:16 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1071.eqiad.wmnet
* 17:16 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:16 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2069.codfw.wmnet
* 17:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 17:14 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:14 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:14 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:13 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:13 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:09 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7014.*
* 17:09 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1071.eqiad.wmnet
* 17:09 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2069.codfw.wmnet
* 17:09 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:08 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1070.eqiad.wmnet
* 17:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2068.codfw.wmnet
* 17:07 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:06 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:06 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host bast2003.wikimedia.org with OS bookworm
* 17:06 cgoubert@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1312-1327].eqiad.wmnet,wikikube-worker-exp1001.eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 17:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['bast2003']
* 17:02 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2068.codfw.wmnet
* 17:02 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1070.eqiad.wmnet
* 17:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2067.codfw.wmnet
* 17:01 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1069.eqiad.wmnet
* 17:01 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 17:00 cgoubert@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7014.magru.wmnet with OS trixie
* 16:58 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 16:58 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 16:58 btullis@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
* 16:57 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7015.magru.wmnet
* 16:56 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7006.magru.wmnet
* 16:55 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1069.eqiad.wmnet
* 16:55 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2067.codfw.wmnet
* 16:53 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1068.eqiad.wmnet
* 16:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2066.codfw.wmnet
* 16:47 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:47 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist all cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 16:46 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 16:46 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['bast2003']
* 16:45 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1068.eqiad.wmnet
* 16:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2066.codfw.wmnet
* 16:44 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1067.eqiad.wmnet
* 16:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2065.codfw.wmnet
* 16:42 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid analytics cluster: Reboot Druid nodes
* 16:40 btullis@cumin1003: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
* 16:37 cgoubert@cumin1003: START - Cookbook sre.dns.netbox
* 16:36 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1067.eqiad.wmnet
* 16:36 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2065.codfw.wmnet
* 16:36 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2064.codfw.wmnet
* 16:35 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1066.eqiad.wmnet
* 16:34 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist group2 cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 16:33 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul2003.codfw.wmnet with OS trixie
* 16:33 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7014.magru.wmnet with reason: host reimage
* 16:32 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wikikube-worker[1306,1308-1311].eqiad.wmnet
* 16:32 cgoubert@cumin1003: START - Cookbook sre.hosts.remove-downtime for wikikube-worker[1306,1308-1311].eqiad.wmnet
* 16:28 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7014.magru.wmnet with reason: host reimage
* 16:28 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1066.eqiad.wmnet
* 16:28 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2064.codfw.wmnet
* 16:25 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on releases2003.codfw.wmnet with reason: [[phab:T420246|T420246]]
* 16:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2063.codfw.wmnet
* 16:25 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1065.eqiad.wmnet
* 16:25 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1306,1308-1311].eqiad.wmnet
* 16:25 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1306,1308-1311].eqiad.wmnet
* 16:18 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1065.eqiad.wmnet
* 16:18 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2063.codfw.wmnet
* 16:18 btullis@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-jumbo-eqiad
* 16:17 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2062.codfw.wmnet
* 16:17 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1064.eqiad.wmnet
* 16:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7013.magru.wmnet
* 16:15 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul2003.codfw.wmnet with reason: host reimage
* 16:14 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7005.magru.wmnet
* 16:10 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2062.codfw.wmnet
* 16:08 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul2003.codfw.wmnet with reason: host reimage
* 16:07 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1064.eqiad.wmnet
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7014.magru.wmnet with OS trixie
* 16:05 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp7014.magru.wmnet with OS trixie
* 16:03 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7013.magru.wmnet,cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 16:03 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[5-8].magru.wmnet<nowiki>}</nowiki> and A:cp
* 15:54 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1009.eqiad.wmnet
* 15:54 mutante: zuul2003 - reimaging with trixie
* 15:52 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist group1 cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 15:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2033.codfw.wmnet
* 15:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2033.codfw.wmnet
* 15:46 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul2003.codfw.wmnet with OS trixie
* 15:45 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1009.eqiad.wmnet
* 15:45 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1008.eqiad.wmnet
* 15:44 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist group0 cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 15:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2033.codfw.wmnet
* 15:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 15:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2048.codfw.wmnet
* 15:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2048.codfw.wmnet
* 15:36 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1008.eqiad.wmnet
* 15:36 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1007.eqiad.wmnet
* 15:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2048.codfw.wmnet
* 15:34 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1012.eqiad.wmnet with reason: host reimage
* 15:33 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist testwikis cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 15:32 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid analytics cluster: Reboot Druid nodes
* 15:28 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1007.eqiad.wmnet
* 15:28 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1006.eqiad.wmnet
* 15:27 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1012.eqiad.wmnet with reason: host reimage
* 15:27 samtar@deploy2002: mwscript-k8s job started: cleanupWatchlistLabelMember.php --wiki=testwiki # [[phab:T420328|T420328]]
* 15:27 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2008-dev.codfw.wmnet
* 15:25 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 15:23 jmm@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 15:22 btullis@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-jumbo-eqiad
* 15:21 jmm@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 15:20 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2008-dev.codfw.wmnet
* 15:20 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1006.eqiad.wmnet
* 15:20 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1005.eqiad.wmnet
* 15:18 jmm@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 15:18 urbanecm@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244723{{!}}cleanup: Growth: Remove temporary GrowthMentorList overrides (T418518)]] (duration: 06m 32s)
* 15:16 jmm@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 15:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 16509
* 15:14 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2005-dev.codfw.wmnet
* 15:14 urbanecm@deploy2002: urbanecm: Continuing with sync
* 15:13 urbanecm@deploy2002: urbanecm: Backport for [[gerrit:1244723{{!}}cleanup: Growth: Remove temporary GrowthMentorList overrides (T418518)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:13 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 15:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2048.codfw.wmnet
* 15:11 urbanecm@deploy2002: Started scap sync-world: Backport for [[gerrit:1244723{{!}}cleanup: Growth: Remove temporary GrowthMentorList overrides (T418518)]]
* 15:10 brennen@deploy2002: Finished deploy [phabricator/deployment@e845707]: deploy phab1004 for [[phab:T420366|T420366]] (duration: 01m 02s)
* 15:09 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1005.eqiad.wmnet
* 15:09 brennen@deploy2002: Started deploy [phabricator/deployment@e845707]: deploy phab1004 for [[phab:T420366|T420366]]
* 15:09 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254217{{!}}Create dblists for wikis where CheckUser and AbuseFilter are disabled (T420063 T420062)]] (duration: 06m 38s)
* 15:09 brennen@deploy2002: Finished deploy [phabricator/deployment@e845707]: deploy phab2002 for [[phab:T420366|T420366]] (duration: 00m 35s)
* 15:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudservices2005-dev.codfw.wmnet
* 15:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2004-dev.codfw.wmnet
* 15:08 brennen@deploy2002: Started deploy [phabricator/deployment@e845707]: deploy phab2002 for [[phab:T420366|T420366]]
* 15:08 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2009.codfw.wmnet
* 15:05 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
* 15:05 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 15:05 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
* 15:04 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1254217{{!}}Create dblists for wikis where CheckUser and AbuseFilter are disabled (T420063 T420062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:04 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7014.magru.wmnet with OS trixie
* 15:03 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host phab1004.eqiad.wmnet
* 15:02 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1254217{{!}}Create dblists for wikis where CheckUser and AbuseFilter are disabled (T420063 T420062)]]
* 15:02 topranks: reset BGP session to ssw1-d8-eqiad from lsw1-d4-eqiad [[phab:T420180|T420180]]
* 15:02 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 15:02 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudservices2004-dev.codfw.wmnet
* 15:02 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2003-dev.codfw.wmnet
* 15:00 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2009.codfw.wmnet
* 15:00 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2008.codfw.wmnet
* 14:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2034.codfw.wmnet
* 14:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2034.codfw.wmnet
* 14:57 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host phab1004.eqiad.wmnet
* 14:55 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2003-dev.codfw.wmnet
* 14:55 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2002-dev.codfw.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host durum4004.ulsfo.wmnet
* 14:53 jmm@deploy2002: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 14:53 jmm@deploy2002: helmfile [staging] START helmfile.d/services/thumbor: apply
* 14:52 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2008.codfw.wmnet
* 14:52 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2007.codfw.wmnet
* 14:51 topranks: stop accepting routes on ssw1-d8-eqiad from external peers (cr2-eqiad, other spines) [[phab:T420351|T420351]]
* 14:51 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
* 14:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host durum4004.ulsfo.wmnet
* 14:50 topranks: stop announcing routes from ssw1-d8-eqiad to external peers (cr2-eqiad, other spines) [[phab:T420351|T420351]]
* 14:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2034.codfw.wmnet
* 14:48 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2002-dev.codfw.wmnet
* 14:48 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2001-dev.codfw.wmnet
* 14:46 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2007.codfw.wmnet
* 14:46 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2006.codfw.wmnet
* 14:45 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
* 14:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
* 14:44 taavi: deploying cr firewall changes from https://gerrit.wikimedia.org/r/c/operations/homer/public/+/1254211
* 14:44 topranks: stop announcing "direct" routes to ssw1-d8-eqiad from cr2-eqiad [[phab:T420351|T420351]]
* 14:44 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2034.codfw.wmnet
* 14:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
* 14:43 moritzm: failover Ganeti master in codfw to ganeti2047
* 14:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2050.codfw.wmnet
* 14:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2050.codfw.wmnet
* 14:41 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2001-dev.codfw.wmnet
* 14:41 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2007-dev.codfw.wmnet
* 14:40 topranks: disabling EVPN IBGP peering from ssw1-d8-eqiad to ssw1-d1-eqiad to stop them reflecting routes [[phab:T420351|T420351]]
* 14:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1006.eqiad.wmnet
* 14:39 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
* 14:38 inflatador: bking@requestctl remove `wdqs_highest_error_rate_ever_seen` requestctl rule as it is no longer needed
* 14:38 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2006.codfw.wmnet
* 14:37 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2005.codfw.wmnet
* 14:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2050.codfw.wmnet
* 14:35 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2007-dev.codfw.wmnet
* 14:35 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2006-dev.codfw.wmnet
* 14:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1006.eqiad.wmnet
* 14:34 Daimona: Creating ce_event_goals DB table for the CampaignEvents extension in x1.testwiki, x1.test2wiki, x1.officewiki, and x1.wikishared # [[phab:T411433|T411433]]
* 14:34 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2050.codfw.wmnet
* 14:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2049.codfw.wmnet
* 14:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2049.codfw.wmnet
* 14:31 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
* 14:30 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2005.codfw.wmnet
* 14:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2049.codfw.wmnet
* 14:27 topranks: de-pref internet circuits landing on cr2-eqiad to shift traffic to cr1 [[phab:T420351|T420351]]
* 14:27 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2006-dev.codfw.wmnet
* 14:27 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2005-dev.codfw.wmnet
* 14:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2049.codfw.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2047.codfw.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2047.codfw.wmnet
* 14:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-test-presto1001.eqiad.wmnet
* 14:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host an-test-presto1001.eqiad.wmnet
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2047.codfw.wmnet
* 14:19 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2005-dev.codfw.wmnet
* 14:19 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudlb2004-dev.codfw.wmnet
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 14:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2047.codfw.wmnet
* 14:14 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2046.codfw.wmnet
* 14:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2046.codfw.wmnet
* 14:13 topranks: disable VRRP on cr2-eqiad interfaces facing ssw1-d8-eqiad [[phab:T420351|T420351]]
* 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 14:11 moritzm: powercycling ganeti2046 (stuck on reboot)
* 14:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2004.codfw.wmnet
* 14:10 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudlb2004-dev.codfw.wmnet
* 14:10 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudlb2003-dev.codfw.wmnet
* 14:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people2004.codfw.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet
* 14:05 topranks: setting cr1-eqiad as VRRP master for all vlans [[phab:T420351|T420351]]
* 14:01 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudlb2003-dev.codfw.wmnet
* 14:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet
* 13:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2046.codfw.wmnet
* 13:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6002.drmrs.wmnet
* 13:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6002.drmrs.wmnet
* 13:57 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.20 refs [[phab:T413811|T413811]]
* 13:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2046.codfw.wmnet
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2045.codfw.wmnet
* 13:52 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudlb2002-dev.codfw.wmnet
* 13:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6002.drmrs.wmnet
* 13:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2045.codfw.wmnet
* 13:45 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254189{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]], [[gerrit:1254190{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]] (duration: 08m 10s)
* 13:44 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2003.codfw.wmnet
* 13:42 esanders@deploy2002: esanders: Continuing with sync
* 13:39 esanders@deploy2002: esanders: Backport for [[gerrit:1254189{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]], [[gerrit:1254190{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:38 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be2003.codfw.wmnet
* 13:38 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2004.codfw.wmnet
* 13:37 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1254189{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]], [[gerrit:1254190{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]]
* 13:35 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on logstash2023.codfw.wmnet with reason: ganeti reboot
* 13:33 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 13:32 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host apus-be2004.codfw.wmnet
* 13:32 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2045.codfw.wmnet
* 13:32 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2002.codfw.wmnet
* 13:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6002.drmrs.wmnet
* 13:30 cscott@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251610{{!}}Turn on postprocessing cache for all Parsoid parses (T348255)]] (duration: 10m 31s)
* 13:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 13:26 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be2002.codfw.wmnet
* 13:26 cscott@deploy2002: cscott: Continuing with sync
* 13:26 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2001.codfw.wmnet
* 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6001.drmrs.wmnet
* 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6001.drmrs.wmnet
* 13:22 cscott@deploy2002: cscott: Backport for [[gerrit:1251610{{!}}Turn on postprocessing cache for all Parsoid parses (T348255)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2044.codfw.wmnet
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2044.codfw.wmnet
* 13:20 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be2001.codfw.wmnet
* 13:20 cgoubert@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker13[00-47].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 13:20 cscott@deploy2002: Started scap sync-world: Backport for [[gerrit:1251610{{!}}Turn on postprocessing cache for all Parsoid parses (T348255)]]
* 13:20 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:19 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:19 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2280-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 13:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6001.drmrs.wmnet
* 13:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6001.drmrs.wmnet
* 13:16 btullis@cumin1003: END (PASS) - Cookbook sre.presto.reboot-workers (exit_code=0) for Presto an-presto cluster: Reboot Presto nodes
* 13:15 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1003.eqiad.wmnet
* 13:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2044.codfw.wmnet
* 13:15 aklapper@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254166{{!}}Remove misplaced readonly from CategoryViewer::$query (T420315)]] (duration: 06m 31s)
* 13:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint1001.eqiad.wmnet
* 13:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint1001.eqiad.wmnet
* 13:11 aklapper@deploy2002: zabe, aklapper: Continuing with sync
* 13:11 aklapper@deploy2002: zabe, aklapper: Backport for [[gerrit:1254166{{!}}Remove misplaced readonly from CategoryViewer::$query (T420315)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:10 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:10 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be1003.eqiad.wmnet
* 13:10 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 16509
* 13:09 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1004.eqiad.wmnet
* 13:09 aklapper@deploy2002: Started scap sync-world: Backport for [[gerrit:1254166{{!}}Remove misplaced readonly from CategoryViewer::$query (T420315)]]
* 13:08 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint2001.codfw.wmnet
* 13:04 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host apus-be1004.eqiad.wmnet
* 13:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint2001.codfw.wmnet
* 13:02 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2044.codfw.wmnet
* 13:02 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1002.eqiad.wmnet
* 13:01 moritzm: failover Ganeti masters in drmrs to ganeti6003/6004
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6004.drmrs.wmnet
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6004.drmrs.wmnet
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2043.codfw.wmnet
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2043.codfw.wmnet
* 12:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 214657
* 12:56 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 214657
* 12:56 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 56308
* 12:55 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-eqiad
* 12:55 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 56308
* 12:55 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 28788
* 12:55 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be1002.eqiad.wmnet
* 12:55 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1001.eqiad.wmnet
* 12:54 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 28788
* 12:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6004.drmrs.wmnet
* 12:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2043.codfw.wmnet
* 12:53 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'email' for AS: 28788
* 12:53 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 28788
* 12:53 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 9269
* 12:52 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1012
* 12:52 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:51 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 9269
* 12:51 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1012
* 12:51 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-e8-eqiad
* 12:51 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-e8-eqiad
* 12:50 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:48 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1015
* 12:48 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be1001.eqiad.wmnet
* 12:45 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1015
* 12:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6004.drmrs.wmnet
* 12:44 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:44 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2043.codfw.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6003.drmrs.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6003.drmrs.wmnet
* 12:40 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 12:39 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2042.codfw.wmnet
* 12:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2042.codfw.wmnet
* 12:38 moritzm: powercycling ganeti2042 (stuck on reboot)
* 12:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6003.drmrs.wmnet
* 12:34 moritzm: powercycling ganeti2041 (stuck on reboot)
* 12:31 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:24 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6003.drmrs.wmnet
* 12:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install1005.wikimedia.org
* 12:22 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2042.codfw.wmnet
* 12:20 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-cluster
* 12:20 Emperor: roll-reboot apus frontends (codfw) for March reboots
* 12:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install1005.wikimedia.org
* 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install2005.wikimedia.org
* 12:13 topranks: restart BGP announcements from ssw1-d1-eqiad following change [[phab:T420180|T420180]]
* 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2042.codfw.wmnet
* 12:08 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2280-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 12:07 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
* 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install2005.wikimedia.org
* 12:06 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
* 12:06 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
* 12:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install3004.wikimedia.org
* 12:06 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
* 12:05 jayme@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=(registry1005.eqiad.wmnet{{!}}registry2005.codfw.wmnet)
* 12:05 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2005.codfw.wmnet
* 12:05 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1005.eqiad.wmnet
* 12:04 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
* 12:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install3004.wikimedia.org
* 12:04 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
* 12:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install4003.wikimedia.org
* 12:03 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c7-eqiad [[phab:T420180|T420180]]
* 12:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2041.codfw.wmnet
* 12:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2041.codfw.wmnet
* 12:01 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1005.eqiad.wmnet
* 12:01 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2005.codfw.wmnet
* 12:00 jayme@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=(registry1005.eqiad.wmnet{{!}}registry2005.codfw.wmnet)
* 12:00 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c6-eqiad [[phab:T420180|T420180]]
* 12:00 jayme@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=(registry1004.eqiad.wmnet{{!}}registry2004.codfw.wmnet)
* 11:59 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1004.eqiad.wmnet
* 11:59 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2004.codfw.wmnet
* 11:59 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c4-eqiad [[phab:T420180|T420180]]
* 11:58 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c3-eqiad [[phab:T420180|T420180]]
* 11:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install4003.wikimedia.org
* 11:56 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c2-eqiad [[phab:T420180|T420180]]
* 11:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install5003.wikimedia.org
* 11:55 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1004.eqiad.wmnet
* 11:55 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2004.codfw.wmnet
* 11:54 jayme@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=(registry1004.eqiad.wmnet{{!}}registry2004.codfw.wmnet)
* 11:54 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-d3-eqiad [[phab:T420180|T420180]]
* 11:53 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-d1-eqiad [[phab:T420180|T420180]]
* 11:52 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 11:52 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5007.eqsin.wmnet
* 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5007.eqsin.wmnet
* 11:49 btullis@cumin1003: START - Cookbook sre.hadoop.reboot-workers for Hadoop analytics cluster
* 11:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install5003.wikimedia.org
* 11:48 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-eqiad
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install6003.wikimedia.org
* 11:47 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 11:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install6003.wikimedia.org
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2041.codfw.wmnet
* 11:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5007.eqsin.wmnet
* 11:43 btullis@cumin1003: START - Cookbook sre.presto.reboot-workers for Presto an-presto cluster: Reboot Presto nodes
* 11:41 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5007.eqsin.wmnet
* 11:41 cgoubert@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker13[00-47].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 11:39 topranks: stop accepting external routes on ssw1-d1-eqiad from cr1-eqiad [[phab:T420180|T420180]]
* 11:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install7002.wikimedia.org
* 11:33 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-cluster
* 11:33 Emperor: roll-reboot apus frontends (eqiad) for March reboots
* 11:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install7002.wikimedia.org
* 11:28 moritzm: failover Ganeti master in eqsin to ganeti5004
* 11:27 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2041.codfw.wmnet
* 11:24 topranks: reduce local-preference for BGP routes learnt from servers on cr1-eqiad [[phab:T420180|T420180]]
* 11:22 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 11:18 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2040.codfw.wmnet
* 11:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2040.codfw.wmnet
* 11:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5006.eqsin.wmnet
* 11:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2040.codfw.wmnet
* 11:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5006.eqsin.wmnet
* 11:05 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:05 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:04 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:03 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2040.codfw.wmnet
* 11:01 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5005.eqsin.wmnet
* 11:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5005.eqsin.wmnet
* 11:00 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:59 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:58 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:58 topranks: prepend external BGP announcements from cr1-eqiad [[phab:T420180|T420180]]
* 10:57 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:56 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:56 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2039.codfw.wmnet
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2039.codfw.wmnet
* 10:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5005.eqsin.wmnet
* 10:52 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-eqiad
* 10:51 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:49 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:49 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:49 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5005.eqsin.wmnet
* 10:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2039.codfw.wmnet
* 10:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 10:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 10:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2039.codfw.wmnet
* 10:45 javiermonton@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 10:45 javiermonton@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2038.codfw.wmnet
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2038.codfw.wmnet
* 10:43 javiermonton@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 10:43 javiermonton@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
* 10:42 topranks: cease announcing routed networks from ssw1-d1-eqiad to cr1-eqiad in BGP [[phab:T420180|T420180]]
* 10:41 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:41 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:40 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:39 javiermonton@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 10:39 javiermonton@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
* 10:38 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2038.codfw.wmnet
* 10:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 10:37 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:33 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudgw2004-dev.codfw.wmnet
* 10:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2038.codfw.wmnet
* 10:29 topranks: stop announcing directly connected routes to L3 switches from cr1-eqiad [[phab:T420180|T420180]]
* 10:28 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:27 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudgw2004-dev.codfw.wmnet
* 10:27 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudgw2003-dev.codfw.wmnet
* 10:26 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:25 topranks: disable EVPN IBGP peering between ssw1-d1-eqiad and ssw1-d8-eqiad [[phab:T420180|T420180]]
* 10:24 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:21 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 10:21 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 10:20 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudgw2003-dev.codfw.wmnet
* 10:20 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:19 urbanecm: Delete `job/growthexperiments-listtaskcounts-29513771` from mw-cron (job stuck for more than a month)
* 10:19 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid1003.eqiad.wmnet
* 10:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 10:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2037.codfw.wmnet
* 10:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2037.codfw.wmnet
* 10:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid1003.eqiad.wmnet
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2002.codfw.wmnet
* 10:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2037.codfw.wmnet
* 10:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2002.codfw.wmnet
* 10:06 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2002.codfw.wmnet
* 10:05 topranks: disabling VRRP for et-1/0/5 sub-interfaces on cr1-eqiad [[phab:T420180|T420180]]
* 10:03 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:03 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:02 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:01 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:01 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:01 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:00 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf2002.codfw.wmnet
* 10:00 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2001.codfw.wmnet
* 09:59 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:59 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:59 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:59 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:58 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid2003.codfw.wmnet
* 09:57 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:57 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 09:57 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 09:56 topranks: shift traffic from codfw to eqiad off Arelion CCT to Lumen
* 09:56 mvernon@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling reboot on A:thanos-fe
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid2003.codfw.wmnet
* 09:54 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf2001.codfw.wmnet
* 09:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 09:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2037.codfw.wmnet
* 09:53 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:52 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:50 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:47 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 09:47 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
* 09:42 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:42 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
* 09:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2002.codfw.wmnet
* 09:38 moritzm: installing openssl bugfix updates on trixie hosts
* 09:31 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2002.codfw.wmnet
* 09:31 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2001.codfw.wmnet
* 09:25 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2001.codfw.wmnet
* 09:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1002.eqiad.wmnet
* 09:21 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-eqiad
* 09:20 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1002.eqiad.wmnet
* 09:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
* 09:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
* 09:10 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254114{{!}}Revert "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]] (duration: 12m 36s)
* 09:06 topranks: increase VRRP priority on eqiad vlans on CR2 to shift active gateway to cr2-eqiad [[phab:T420180|T420180]]
* 09:05 mvernon@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling reboot on A:thanos-fe
* 09:03 kharlan@deploy2002: kharlan: Continuing with sync
* 09:02 kharlan@deploy2002: kharlan: Backport for [[gerrit:1254114{{!}}Revert "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:58 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-canary
* 08:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2036.codfw.wmnet
* 08:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2036.codfw.wmnet
* 08:57 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1254114{{!}}Revert "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]]
* 08:57 moritzm: rebuilt the trixie d-i image for the 13.4 point release [[phab:T420240|T420240]]
* 08:54 kharlan@deploy2002: Sync cancelled.
* 08:52 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-canary
* 08:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2036.codfw.wmnet
* 08:49 kharlan@deploy2002: harroyo-wmf, kharlan: Backport for [[gerrit:1250575{{!}}hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend (T419125)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2036.codfw.wmnet
* 08:44 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host bast2003.wikimedia.org
* 08:43 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1250575{{!}}hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend (T419125)]]
* 08:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:42 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:35 arnaudb@cumin1003: END (PASS) - Cookbook sre.gerrit.restart-gerrit (exit_code=0) Restarting Gerrit on gerrit2002
* 08:34 arnaudb@cumin1003: START - Cookbook sre.gerrit.restart-gerrit Restarting Gerrit on gerrit2002
* 08:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
* 08:34 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host contint1002.wikimedia.org
* 08:32 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:28 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
* 08:27 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host contint1002.wikimedia.org
* 08:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2035.codfw.wmnet
* 08:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2035.codfw.wmnet
* 08:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host parsoidtest1001.eqiad.wmnet
* 08:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2035.codfw.wmnet
* 08:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti3005.esams.wmnet to cluster esams03 and group B
* 08:14 moritzm: powercycling bast2003 (stuck on reboot)
* 08:14 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti3005.esams.wmnet to cluster esams03 and group B
* 08:14 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host parsoidtest1001.eqiad.wmnet
* 08:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3005.esams.wmnet
* 08:09 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:08 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3005.esams.wmnet
* 07:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
* 07:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5004.wikimedia.org
* 07:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti3005.esams.wmnet with OS bookworm
* 07:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5004.wikimedia.org
* 07:37 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1285-1289,1291-1299].eqiad.wmnet
* 07:37 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1285-1289,1291-1299].eqiad.wmnet
* 07:34 arnaudb@cumin1003: END (PASS) - Cookbook sre.gerrit.restart-gerrit (exit_code=0) Restarting Gerrit on gerrit2003
* 07:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2035.codfw.wmnet
* 07:32 arnaudb@cumin1003: START - Cookbook sre.gerrit.restart-gerrit Restarting Gerrit on gerrit2003
* 07:32 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2033.codfw.wmnet
* 07:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 07:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2032.codfw.wmnet
* 07:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2032.codfw.wmnet
* 07:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti3005.esams.wmnet with reason: host reimage
* 07:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti3005.esams.wmnet with reason: host reimage
* 07:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2032.codfw.wmnet
* 07:23 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2032.codfw.wmnet
* 07:00 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti3005.esams.wmnet with OS bookworm
* 06:08 kart_: Updated cxserver to 2026-03-16-071247-production ([[phab:T420004|T420004]])
* 06:07 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 06:06 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 06:05 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 06:04 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 05:58 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 05:58 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 04:41 dwisehaupt@dns1005: END - running authdns-update
* 04:39 dwisehaupt@dns1005: START - running authdns-update
* 04:01 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.17 (duration: 01m 17s)
* 03:43 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.20 refs [[phab:T413811|T413811]] (duration: 39m 34s)
* 03:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.20 refs [[phab:T413811|T413811]]
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 10s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 00:26 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6009.*
* 00:25 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6009.drmrs.wmnet with OS trixie
* 00:07 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251158{{!}}Enable languages in main menu on Russian Wikipedia (T419730)]] (duration: 06m 57s)
* 00:03 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 00:02 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1251158{{!}}Enable languages in main menu on Russian Wikipedia (T419730)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1251158{{!}}Enable languages in main menu on Russian Wikipedia (T419730)]]
== 2026-03-16 ==
* 23:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6009.drmrs.wmnet with reason: host reimage
* 23:56 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253604{{!}}Don't output language HTML when no languages present (T419730)]], [[gerrit:1251157{{!}}Support duplication of languages in header and main menu (T419730)]] (duration: 06m 44s)
* 23:56 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6009.drmrs.wmnet with reason: host reimage
* 23:52 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 23:51 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1253604{{!}}Don't output language HTML when no languages present (T419730)]], [[gerrit:1251157{{!}}Support duplication of languages in header and main menu (T419730)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:50 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1253604{{!}}Don't output language HTML when no languages present (T419730)]], [[gerrit:1251157{{!}}Support duplication of languages in header and main menu (T419730)]]
* 23:36 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6009.drmrs.wmnet with OS trixie
* 23:32 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp601(0{{!}}1).*
* 22:54 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6008.drmrs.wmnet [reason: trixie reimaging]
* 22:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6008.drmrs.wmnet with OS trixie
* 22:37 jforrester@deploy2002: Finished scap sync-world: [[phab:T411807|T411807]] (duration: 11m 10s)
* 22:35 jforrester@deploy2002: jforrester: Continuing with sync
* 22:32 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6010.drmrs.wmnet with OS trixie
* 22:31 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp70[09-12].magru.wmnet<nowiki>}</nowiki> and A:cp
* 22:31 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7012.magru.wmnet
* 22:31 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[1285-1289,1291-1299].eqiad.wmnet
* 22:30 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[1-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 22:30 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7004.magru.wmnet
* 22:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6011.drmrs.wmnet with OS trixie
* 22:28 jforrester@deploy2002: jforrester: [[phab:T411807|T411807]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 jforrester@deploy2002: Started scap sync-world: [[phab:T411807|T411807]]
* 22:27 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1285-1289,1291-1299].eqiad.wmnet
* 22:24 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6008.drmrs.wmnet with reason: host reimage
* 22:20 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6008.drmrs.wmnet with reason: host reimage
* 22:17 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1020-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 22:07 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6010.drmrs.wmnet with reason: host reimage
* 22:05 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6007.drmrs.wmnet [reason: trixie reimaging]
* 22:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6011.drmrs.wmnet with reason: host reimage
* 22:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6007.drmrs.wmnet with OS trixie
* 22:02 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6008.drmrs.wmnet with OS trixie
* 21:59 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6010.drmrs.wmnet with reason: host reimage
* 21:58 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6011.drmrs.wmnet with reason: host reimage
* 21:58 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp6008.drmrs.wmnet with OS trixie
* 21:52 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7011.magru.wmnet
* 21:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7003.magru.wmnet
* 21:42 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul1003.eqiad.wmnet with OS trixie
* 21:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6008.drmrs.wmnet with reason: host reimage
* 21:40 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6010.drmrs.wmnet with OS trixie
* 21:40 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6012.*
* 21:39 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6012.drmrs.wmnet with OS trixie
* 21:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6011.drmrs.wmnet with OS trixie
* 21:38 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6007.drmrs.wmnet with reason: host reimage
* 21:36 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6013.*
* 21:36 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6008.drmrs.wmnet with reason: host reimage
* 21:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6013.drmrs.wmnet with OS trixie
* 21:32 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6007.drmrs.wmnet with reason: host reimage
* 21:28 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul1003.eqiad.wmnet with reason: host reimage
* 21:22 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul1003.eqiad.wmnet with reason: host reimage
* 21:19 Dreamy_Jazz: Evening UTC backport window done
* 21:18 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251848{{!}}Disable CheckUser on closed wikis where no checks were ever made (T420062)]], [[gerrit:1251865{{!}}Uninstall SecurePoll from closed wikis (T420062)]], [[gerrit:1251888{{!}}DiscussionTools: Uninstall wikis closed before permalinks were deployed (T420052)]] (duration: 06m 10s)
* 21:17 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6008.drmrs.wmnet with OS trixie
* 21:16 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6008.drmrs.wmnet [reason: trixie reimaging]
* 21:15 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6006.drmrs.wmnet [reason: trixie reimaging]
* 21:15 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6006.drmrs.wmnet with OS trixie
* 21:14 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 21:14 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1251848{{!}}Disable CheckUser on closed wikis where no checks were ever made (T420062)]], [[gerrit:1251865{{!}}Uninstall SecurePoll from closed wikis (T420062)]], [[gerrit:1251888{{!}}DiscussionTools: Uninstall wikis closed before permalinks were deployed (T420052)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:13 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6012.drmrs.wmnet with reason: host reimage
* 21:12 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6007.drmrs.wmnet with OS trixie
* 21:12 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1251848{{!}}Disable CheckUser on closed wikis where no checks were ever made (T420062)]], [[gerrit:1251865{{!}}Uninstall SecurePoll from closed wikis (T420062)]], [[gerrit:1251888{{!}}DiscussionTools: Uninstall wikis closed before permalinks were deployed (T420052)]]
* 21:12 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6007.drmrs.wmnet [reason: trixie reimaging]
* 21:11 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6005.drmrs.wmnet [reason: trixie reimaging]
* 21:10 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6005.drmrs.wmnet with OS trixie
* 21:10 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7010.magru.wmnet
* 21:10 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7002.magru.wmnet
* 21:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6013.drmrs.wmnet with reason: host reimage
* 21:08 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul1003.eqiad.wmnet with OS trixie
* 21:07 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6012.drmrs.wmnet with reason: host reimage
* 21:06 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6013.drmrs.wmnet with reason: host reimage
* 21:05 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253623{{!}}Fix client credentials access tokens (T417278 T419921)]], [[gerrit:1253625{{!}}Enable $wgTrackMediaRequestProvenance on testwikis and beta cluster (T414338)]], [[gerrit:1253626{{!}}Configure $wgApiClientErrorSampleRate (T418957)]] (duration: 08m 06s)
* 21:01 catrope@deploy2002: matmarex, catrope: Continuing with sync
* 20:59 catrope@deploy2002: matmarex, catrope: Backport for [[gerrit:1253623{{!}}Fix client credentials access tokens (T417278 T419921)]], [[gerrit:1253625{{!}}Enable $wgTrackMediaRequestProvenance on testwikis and beta cluster (T414338)]], [[gerrit:1253626{{!}}Configure $wgApiClientErrorSampleRate (T418957)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:57 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1253623{{!}}Fix client credentials access tokens (T417278 T419921)]], [[gerrit:1253625{{!}}Enable $wgTrackMediaRequestProvenance on testwikis and beta cluster (T414338)]], [[gerrit:1253626{{!}}Configure $wgApiClientErrorSampleRate (T418957)]]
* 20:54 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:54 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:50 brett@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cp[2027-2040].codfw.wmnet
* 20:50 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:50 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2027-2040].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:50 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2027-2040].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6006.drmrs.wmnet with reason: host reimage
* 20:48 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6012.drmrs.wmnet with OS trixie
* 20:46 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6013.drmrs.wmnet with OS trixie
* 20:45 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on cp2042.codfw.wmnet with reason: Testing hosts - not for production
* 20:45 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6005.drmrs.wmnet with reason: host reimage
* 20:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2042.codfw.wmnet with OS trixie
* 20:44 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephmon2007-dev.codfw.wmnet with OS bookworm
* 20:44 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 20:44 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253566{{!}}Configure external link aggregate usage on 12 wikis for top domains (T419837)]] (duration: 06m 59s)
* 20:43 brett@cumin2002: START - Cookbook sre.dns.netbox
* 20:41 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6006.drmrs.wmnet with reason: host reimage
* 20:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on cp2041.codfw.wmnet with reason: Testing hosts - not for production
* 20:41 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6005.drmrs.wmnet with reason: host reimage
* 20:40 kharlan@deploy2002: kharlan, mszwarc: Continuing with sync
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2041.codfw.wmnet with OS trixie
* 20:39 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 20:38 kharlan@deploy2002: kharlan, mszwarc: Backport for [[gerrit:1253566{{!}}Configure external link aggregate usage on 12 wikis for top domains (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:37 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1253566{{!}}Configure external link aggregate usage on 12 wikis for top domains (T419837)]]
* 20:34 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6014.*
* 20:33 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:33 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:32 cscott@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253551{{!}}Fix double post-processing in legacy preview case (T419908)]] (duration: 06m 52s)
* 20:29 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7009.magru.wmnet
* 20:28 cscott@deploy2002: cscott: Continuing with sync
* 20:28 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7001.magru.wmnet
* 20:27 cscott@deploy2002: cscott: Backport for [[gerrit:1253551{{!}}Fix double post-processing in legacy preview case (T419908)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:26 cscott@deploy2002: Started scap sync-world: Backport for [[gerrit:1253551{{!}}Fix double post-processing in legacy preview case (T419908)]]
* 20:22 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2042.codfw.wmnet with reason: host reimage
* 20:22 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6006.drmrs.wmnet with OS trixie
* 20:21 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6006.drmrs.wmnet [reason: trixie reimaging]
* 20:21 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6004.drmrs.wmnet [reason: trixie reimaging]
* 20:21 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6005.drmrs.wmnet with OS trixie
* 20:20 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6004.drmrs.wmnet with OS trixie
* 20:20 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6005.drmrs.wmnet [reason: trixie reimaging]
* 20:19 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6003.drmrs.wmnet [reason: trixie reimaging]
* 20:19 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephmon2007-dev.codfw.wmnet with reason: host reimage
* 20:19 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp70[09-12].magru.wmnet<nowiki>}</nowiki> and A:cp
* 20:18 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[1-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 20:17 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248665{{!}}Enable passwordless login in production (T419198)]], [[gerrit:1253572{{!}}Instrument clicks on external links to selected domains (T419837)]] (duration: 06m 43s)
* 20:16 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:15 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
* 20:15 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephmon2007-dev.codfw.wmnet with reason: host reimage
* 20:15 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6003.drmrs.wmnet with OS trixie
* 20:13 catrope@deploy2002: kharlan, catrope: Continuing with sync
* 20:12 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:12 catrope@deploy2002: kharlan, catrope: Backport for [[gerrit:1248665{{!}}Enable passwordless login in production (T419198)]], [[gerrit:1253572{{!}}Instrument clicks on external links to selected domains (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:12 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2042.codfw.wmnet with reason: host reimage
* 20:11 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
* 20:10 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1248665{{!}}Enable passwordless login in production (T419198)]], [[gerrit:1253572{{!}}Instrument clicks on external links to selected domains (T419837)]]
* 20:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6014.drmrs.wmnet with OS trixie
* 20:03 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp[2027-2040].codfw.wmnet
* 20:01 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251589{{!}}Uninstall GlobalBlocking from closed wikis (T420062)]] (duration: 08m 20s)
* 19:57 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 19:55 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6004.drmrs.wmnet with reason: host reimage
* 19:55 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephmon2007-dev.codfw.wmnet with OS bookworm
* 19:54 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1251589{{!}}Uninstall GlobalBlocking from closed wikis (T420062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:54 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp2042.codfw.wmnet with OS trixie
* 19:53 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephmon2007-dev.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:53 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS trixie
* 19:52 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1251589{{!}}Uninstall GlobalBlocking from closed wikis (T420062)]]
* 19:52 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host cloudcephmon2007-dev.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:51 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251582{{!}}Uninstall AbuseFilter from closed wikis with no AbuseFilter logs (T420063)]] (duration: 09m 26s)
* 19:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6003.drmrs.wmnet with reason: host reimage
* 19:47 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 19:47 mutante: releases2003 - rm rsync-srv-org-wikimedia-releases-releases2003.* - alerts flapping since server reboot - puppet code needs to be improved to ensure units are removed when primary server is switched ([[phab:T420246|T420246]])
* 19:47 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6004.drmrs.wmnet with reason: host reimage
* 19:46 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6003.drmrs.wmnet with reason: host reimage
* 19:44 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1251582{{!}}Uninstall AbuseFilter from closed wikis with no AbuseFilter logs (T420063)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:43 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6014.drmrs.wmnet with reason: host reimage
* 19:42 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1251582{{!}}Uninstall AbuseFilter from closed wikis with no AbuseFilter logs (T420063)]]
* 19:41 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudcephmon2007-dev
* 19:41 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudcephmon2007-dev
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating cloudcephmon2007-dev in codfw - jhancock@cumin2002"
* 19:40 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6014.drmrs.wmnet with reason: host reimage
* 19:39 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253622{{!}}Revert "Media: Use previous step for non-standard width between steps and original" (T419927)]] (duration: 07m 10s)
* 19:35 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 19:34 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1253622{{!}}Revert "Media: Use previous step for non-standard width between steps and original" (T419927)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:32 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating cloudcephmon2007-dev in codfw - jhancock@cumin2002"
* 19:32 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1253622{{!}}Revert "Media: Use previous step for non-standard width between steps and original" (T419927)]]
* 19:28 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp404[5-6].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 19:28 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4046.ulsfo.wmnet
* 19:27 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6004.drmrs.wmnet with OS trixie
* 19:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 19:27 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6004.drmrs.wmnet [reason: trixie reimaging]
* 19:27 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6003.drmrs.wmnet with OS trixie
* 19:26 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6003.drmrs.wmnet [reason: trixie reimaging]
* 19:25 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6002.drmrs.wmnet [reason: trixie reimaging]
* 19:25 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6001.drmrs.wmnet [reason: trixie reimaging]
* 19:21 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6014.drmrs.wmnet with OS trixie
* 19:17 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6002.drmrs.wmnet with OS trixie
* 19:17 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on cp2042.codfw.wmnet with reason: Testing hosts - not for production
* 19:16 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on cp2041.codfw.wmnet with reason: Testing hosts - not for production
* 19:15 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:15 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:12 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6001.drmrs.wmnet with OS trixie
* 19:02 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:02 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 18:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp4046.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 18:57 cdobbins@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4046.ulsfo.wmnet
* 18:52 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6002.drmrs.wmnet with reason: host reimage
* 18:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4045.ulsfo.wmnet
* 18:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6001.drmrs.wmnet with reason: host reimage
* 18:47 cdobbins@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp4046.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 18:47 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6002.drmrs.wmnet with reason: host reimage
* 18:45 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6001.drmrs.wmnet with reason: host reimage
* 18:39 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp404[5-6].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 18:38 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6015.*
* 18:38 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp403[7-9].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 18:38 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4039.ulsfo.wmnet
* 18:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6015.drmrs.wmnet with OS trixie
* 18:27 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6002.drmrs.wmnet with OS trixie
* 18:26 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6002.drmrs.wmnet [reason: trixie reimaging]
* 18:26 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6001.drmrs.wmnet with OS trixie
* 18:24 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6001.drmrs.wmnet [reason: trixie reimaging]
* 18:03 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6015.drmrs.wmnet with reason: host reimage
* 17:59 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6015.drmrs.wmnet with reason: host reimage
* 17:58 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4038.ulsfo.wmnet
* 17:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6015.drmrs.wmnet with OS trixie
* 17:37 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6016.*
* 17:32 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2042.codfw.wmnet with OS trixie
* 17:18 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4037.ulsfo.wmnet
* 17:08 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2042.codfw.wmnet with reason: host reimage
* 17:06 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp403[7-9].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 17:03 fabfur@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2042.codfw.wmnet with reason: host reimage
* 17:02 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6016.drmrs.wmnet with OS trixie
* 16:57 mutante: contint2002 - rebooting
* 16:47 mutante: phab2002 - rebooting
* 16:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:44 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253520{{!}}Add APCOND_OATH_HAS2FA to UserRequirementsPrivateConditions]] (duration: 06m 15s)
* 16:42 mutante: rebooting backends of releases.wikimedia.org
* 16:42 fabfur@cumin1003: START - Cookbook sre.hosts.reimage for host cp2042.codfw.wmnet with OS trixie
* 16:41 fabfur: reimage cp2042 for HAProxy testing ([[phab:T419825|T419825]])
* 16:41 mszwarc@deploy2002: mszwarc: Continuing with sync
* 16:40 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1253520{{!}}Add APCOND_OATH_HAS2FA to UserRequirementsPrivateConditions]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:39 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2041.codfw.wmnet with OS trixie
* 16:38 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1253520{{!}}Add APCOND_OATH_HAS2FA to UserRequirementsPrivateConditions]]
* 16:37 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1020-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 16:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6016.drmrs.wmnet with reason: host reimage
* 16:32 milimetric: my bad, accidentally merged https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1250249, will read docs on config deployment better
* 16:31 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:29 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1012
* 16:27 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1012
* 16:27 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6016.drmrs.wmnet with reason: host reimage
* 16:20 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]] (duration: 07m 28s)
* 16:17 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode2001.codfw.wmnet
* 16:16 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 16:14 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:13 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2005.codfw.wmnet
* 16:12 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]]
* 16:12 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode2001.codfw.wmnet
* 16:11 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode1001.eqiad.wmnet
* 16:11 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts,name=codfw
* 16:11 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
* 16:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum2001.codfw.wmnet
* 16:09 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1024.eqiad.wmnet
* 16:09 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1024.eqiad.wmnet
* 16:09 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1024.eqiad.wmnet
* 16:07 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1004-1007,1011-1012,1015-1016,1019-1021,1029-1031,1034-1168,1240-1289,1291-1327].eqiad.wmnet
* 16:06 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1004-1007,1011-1012,1015-1016,1019-1021,1029-1031,1034-1168,1240-1289,1291-1327].eqiad.wmnet
* 16:06 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6016.drmrs.wmnet with OS trixie
* 16:06 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2005.codfw.wmnet
* 16:06 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode1001.eqiad.wmnet
* 16:05 dwisehaupt@dns1006: END - running authdns-update
* 16:05 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum2001.codfw.wmnet
* 16:05 fabfur@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
* 16:04 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts,name=codfw
* 16:04 dwisehaupt@dns1006: START - running authdns-update
* 16:04 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts,name=eqiad
* 16:00 blake@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1004-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 15:59 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum1001.eqiad.wmnet
* 15:59 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2031.codfw.wmnet
* 15:59 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti2031.codfw.wmnet
* 15:54 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2004.codfw.wmnet
* 15:53 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum1001.eqiad.wmnet
* 15:52 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts,name=eqiad
* 15:52 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncmonitor1001.eqiad.wmnet
* 15:47 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2004.codfw.wmnet
* 15:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2031.codfw.wmnet
* 15:47 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet
* 15:46 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1024.eqiad.wmnet with reason: Rebooting clouddb1024 [[phab:T419960|T419960]]
* 15:44 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1024.eqiad.wmnet
* 15:43 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2031.codfw.wmnet
* 15:43 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1023.eqiad.wmnet
* 15:43 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1023.eqiad.wmnet
* 15:43 fabfur@cumin1003: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS trixie
* 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2030.codfw.wmnet
* 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2030.codfw.wmnet
* 15:42 fabfur: reimage cp2041 for HAProxy testing ([[phab:T419825|T419825]])
* 15:42 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:41 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:41 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2003.codfw.wmnet
* 15:39 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:38 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:38 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-codfw
* 15:38 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:38 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:37 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Rebooting clouddb1023 [[phab:T419960|T419960]]
* 15:35 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1022.eqiad.wmnet
* 15:35 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1022.eqiad.wmnet
* 15:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2030.codfw.wmnet
* 15:32 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2003.codfw.wmnet
* 15:32 dwisehaupt@dns1006: END - running authdns-update
* 15:32 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2002.codfw.wmnet
* 15:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 15:31 dwisehaupt@dns1006: START - running authdns-update
* 15:27 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling reboot on A:swift-fe-codfw
* 15:26 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-codfw
* 15:26 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2029.codfw.wmnet
* 15:26 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti2029.codfw.wmnet
* 15:26 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:26 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:24 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:24 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:22 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2002.codfw.wmnet
* 15:21 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Rebooting clouddb1023 [[phab:T419960|T419960]]
* 15:20 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2001.codfw.wmnet
* 15:20 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]]
* 15:16 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Rebooting clouddb1022 [[phab:T419960|T419960]]
* 15:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2029.codfw.wmnet
* 15:11 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling reboot on A:durum and A:durum
* 15:09 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2029.codfw.wmnet
* 15:04 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2001.codfw.wmnet
* 15:03 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:02 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling reboot on A:wikidough
* 15:01 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudgw1004.eqiad.wmnet
* 15:01 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 15:00 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2028.codfw.wmnet
* 15:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2028.codfw.wmnet
* 14:56 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:55 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:55 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:54 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]]
* 14:53 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudgw1004.eqiad.wmnet
* 14:53 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-codfw
* 14:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2028.codfw.wmnet
* 14:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2028.codfw.wmnet
* 14:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:50 mvernon@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling reboot on A:swift-fe-eqiad
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw1003.eqiad.wmnet with OS trixie
* 14:31 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:31 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:30 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:28 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:28 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:26 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:22 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1002-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1002-1003].eqiad.wmnet
* 14:22 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1002-1003].eqiad.wmnet
* 14:21 blake@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1002-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:21 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1002-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:20 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:20 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:20 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:18 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253461{{!}}fix(anon warning): remove wring type=signup param (T415160)]], [[gerrit:1253450{{!}}AccountCreation: track account registrations for WE1.8 experiments (T416100)]] (duration: 09m 16s)
* 14:17 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:17 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:14 sgimeno@deploy2002: sgimeno: Continuing with sync
* 14:13 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:13 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:11 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:11 sgimeno@deploy2002: sgimeno: Backport for [[gerrit:1253461{{!}}fix(anon warning): remove wring type=signup param (T415160)]], [[gerrit:1253450{{!}}AccountCreation: track account registrations for WE1.8 experiments (T416100)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:11 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:10 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw1003.eqiad.wmnet with reason: host reimage
* 14:09 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1253461{{!}}fix(anon warning): remove wring type=signup param (T415160)]], [[gerrit:1253450{{!}}AccountCreation: track account registrations for WE1.8 experiments (T416100)]]
* 14:08 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2026.codfw.wmnet
* 14:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 14:08 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:04 arnaudb@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gerrit2002.wikimedia.org with reason: testing
* 14:03 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw1003.eqiad.wmnet with reason: host reimage
* 14:02 arnaudb@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 2:00:00 on gerrit2002.wikimedia.org with reason: [[phab:T418256|T418256]]
* 14:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1003.eqiad.wmnet
* 13:58 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 13:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1003.eqiad.wmnet
* 13:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1002.eqiad.wmnet
* 13:45 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253046{{!}}bowiki: update logos (T419268)]] (duration: 06m 17s)
* 13:45 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudgw1003.eqiad.wmnet with OS trixie
* 13:43 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-codfw
* 13:43 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad
* 13:41 mszwarc@deploy2002: mszwarc, anzx: Continuing with sync
* 13:41 mszwarc@deploy2002: mszwarc, anzx: Backport for [[gerrit:1253046{{!}}bowiki: update logos (T419268)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1002.eqiad.wmnet
* 13:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2004.codfw.wmnet
* 13:39 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1253046{{!}}bowiki: update logos (T419268)]]
* 13:38 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253423{{!}}Always use external actor for interwiki rights logs on target wiki (T6055)]] (duration: 08m 53s)
* 13:34 mszwarc@deploy2002: mszwarc: Continuing with sync
* 13:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2004.codfw.wmnet
* 13:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2026.codfw.wmnet
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2025.codfw.wmnet
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2003.codfw.wmnet
* 13:31 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1253423{{!}}Always use external actor for interwiki rights logs on target wiki (T6055)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 13:29 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1253423{{!}}Always use external actor for interwiki rights logs on target wiki (T6055)]]
* 13:28 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad
* 13:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2003.codfw.wmnet
* 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow3004.esams.wmnet
* 13:25 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling reboot on A:wikidough
* 13:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 13:22 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-durum rolling reboot on A:durum and A:durum
* 13:21 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2025.codfw.wmnet
* 13:21 XioNoX: drain edgeuno transit for optic replacement - [[phab:T415743|T415743]]
* 13:19 cgoubert@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host wikikube-ctrl1004.eqiad.wmnet
* 13:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow3004.esams.wmnet
* 13:14 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251487{{!}}Replace direct BagOStuff with WANObjectCache (T419666)]] (duration: 11m 25s)
* 13:11 bwojtowicz@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow4003.ulsfo.wmnet
* 13:09 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti3005.esams.wmnet
* 13:09 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ganeti3005.esams.wmnet
* 13:07 jforrester@deploy2002: jforrester: Continuing with sync
* 13:06 jforrester@deploy2002: jforrester: Backport for [[gerrit:1251487{{!}}Replace direct BagOStuff with WANObjectCache (T419666)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl1004.eqiad.wmnet
* 13:04 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 13:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ncredir4002.ulsfo.wmnet
* 13:04 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ncredir4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 13:03 jiji@cumin1003: END (ERROR) - Cookbook sre.memcached.roll-reboot-restart (exit_code=97) rolling reboot on A:memcached-gutter-eqiad
* 13:03 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ncredir4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 13:03 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1251487{{!}}Replace direct BagOStuff with WANObjectCache (T419666)]]
* 13:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow4003.ulsfo.wmnet
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow5002.eqsin.wmnet
* 12:51 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl1003.eqiad.wmnet
* 12:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow5002.eqsin.wmnet
* 12:48 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3005.esams.wmnet
* 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow6001.drmrs.wmnet
* 12:44 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:43 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3005.esams.wmnet
* 12:42 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl1003.eqiad.wmnet
* 12:41 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl1002.eqiad.wmnet
* 12:40 aikochou@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main'.
* 12:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow6001.drmrs.wmnet
* 12:37 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ncredir4002.ulsfo.wmnet
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ncredir4001.ulsfo.wmnet
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ncredir4001.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 12:35 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:34 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow7002.magru.wmnet
* 12:32 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:32 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:32 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:28 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1017
* 12:27 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ncredir4001.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 12:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow7002.magru.wmnet
* 12:27 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1017
* 12:25 aikochou@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main'.
* 12:25 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl1002.eqiad.wmnet
* 12:22 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:20 moritzm: failover Ganeti master in esams to ganeti3008
* 12:20 moritzm: failover Ganeti master in esams to ganeti3005
* 12:15 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:14 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:10 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ncredir4001.ulsfo.wmnet
* 12:00 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti3006.esams.wmnet
* 12:00 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti3006.esams.wmnet
* 11:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for druid[1009-1013].eqiad.wmnet
* 11:57 btullis@cumin1003: START - Cookbook sre.hosts.remove-downtime for druid[1009-1013].eqiad.wmnet
* 11:57 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.remove-downtime (exit_code=97) for druid[1009-1013].eqiad.wmnet
* 11:57 btullis@cumin1003: START - Cookbook sre.hosts.remove-downtime for druid[1009-1013].eqiad.wmnet
* 11:46 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1009.eqiad.wmnet with OS bookworm
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3006.esams.wmnet
* 11:43 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1010.eqiad.wmnet with OS bookworm
* 11:37 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1011.eqiad.wmnet with OS bookworm
* 11:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1012.eqiad.wmnet with OS bookworm
* 11:30 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3006.esams.wmnet
* 11:29 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1013.eqiad.wmnet with OS bookworm
* 11:24 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:24 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:22 btullis@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6 days, 0:00:00 on dse-k8s-worker[1012,1015-1017].eqiad.wmnet with reason: Adding 10 Gbps NIC
* 11:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1009.eqiad.wmnet with reason: host reimage
* 11:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1010.eqiad.wmnet with reason: host reimage
* 11:15 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:14 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:12 mvernon@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling reboot on A:swift-fe-eqiad
* 11:12 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling reboot on A:swift-fe-codfw
* 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1011.eqiad.wmnet with reason: host reimage
* 11:10 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:10 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1012.eqiad.wmnet with reason: host reimage
* 11:07 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gerrit2003.wikimedia.org
* 11:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1013.eqiad.wmnet with reason: host reimage
* 11:06 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 11:06 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 11:04 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1010.eqiad.wmnet with reason: host reimage
* 11:02 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1011.eqiad.wmnet with reason: host reimage
* 11:02 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1009.eqiad.wmnet with reason: host reimage
* 11:01 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1012.eqiad.wmnet with reason: host reimage
* 11:00 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gerrit2003.wikimedia.org
* 10:57 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1013.eqiad.wmnet with reason: host reimage
* 10:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2010.codfw.wmnet
* 10:47 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1013.eqiad.wmnet with OS bookworm
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1012.eqiad.wmnet with OS bookworm
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1011.eqiad.wmnet with OS bookworm
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1010.eqiad.wmnet with OS bookworm
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1009.eqiad.wmnet with OS bookworm
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3007.esams.wmnet
* 10:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2010.codfw.wmnet
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3007.esams.wmnet
* 10:31 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3007.esams.wmnet
* 10:29 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3007.esams.wmnet
* 10:28 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:28 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3008.esams.wmnet
* 10:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3008.esams.wmnet
* 10:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2009.codfw.wmnet
* 10:24 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:24 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:23 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:20 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2009.codfw.wmnet
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3008.esams.wmnet
* 10:14 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3008.esams.wmnet
* 10:09 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:08 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2004.codfw.wmnet
* 10:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7001.magru.wmnet
* 10:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7001.magru.wmnet
* 10:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2004.codfw.wmnet
* 09:56 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7001.magru.wmnet
* 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts tcp-proxy4002.ulsfo.wmnet
* 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: tcp-proxy4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 09:54 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: tcp-proxy4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 09:51 elukey@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 09:51 elukey@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 09:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7001.magru.wmnet
* 09:51 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:46 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts tcp-proxy4002.ulsfo.wmnet
* 09:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: decom tcp-proxy4001 - jmm@cumin2002"
* 09:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: decom tcp-proxy4001 - jmm@cumin2002"
* 09:43 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm2001.wikimedia.org
* 09:39 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idm2001.wikimedia.org
* 09:38 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:38 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 09:38 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 09:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:35 slyngshede@dns1004: END - running authdns-update
* 09:34 elukey@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 09:34 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 09:34 elukey@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 09:33 slyngshede@dns1004: START - running authdns-update
* 09:32 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:30 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm1001.wikimedia.org
* 09:26 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 09:26 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idm1001.wikimedia.org
* 09:24 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm-test1001.wikimedia.org
* 09:22 moritzm: failover Ganeti master in magru to ganeti7004
* 09:21 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts tcp-proxy4001.ulsfo.wmnet
* 09:21 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 09:20 jmm@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-test-eqiad
* 09:20 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idm-test1001.wikimedia.org
* 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2002.codfw.wmnet
* 09:18 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:15 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM cloudidp2001-dev.codfw.wmnet
* 09:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2002.codfw.wmnet
* 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7002.magru.wmnet
* 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7002.magru.wmnet
* 09:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2001.codfw.wmnet
* 09:13 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts tcp-proxy4001.ulsfo.wmnet
* 09:11 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM cloudidp2001-dev.codfw.wmnet
* 09:09 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idp2005.wikimedia.org
* 09:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2001.codfw.wmnet
* 09:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7002.magru.wmnet
* 09:05 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idp2005.wikimedia.org
* 09:03 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7002.magru.wmnet
* 09:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pki-root1002.eqiad.wmnet
* 08:59 slyngshede@dns1004: END - running authdns-update
* 08:58 slyngshede@dns1004: START - running authdns-update
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7003.magru.wmnet
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7003.magru.wmnet
* 08:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host pki-root1002.eqiad.wmnet
* 08:49 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idp1005.wikimedia.org
* 08:48 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:48 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 08:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7003.magru.wmnet
* 08:48 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 08:47 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 08:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7003.magru.wmnet
* 08:44 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idp1005.wikimedia.org
* 08:44 jmm@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-test-eqiad
* 08:44 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 08:39 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idp-test1005.wikimedia.org
* 08:35 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idp-test1005.wikimedia.org
* 08:33 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idp-test2005.wikimedia.org
* 08:29 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idp-test2005.wikimedia.org
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7004.magru.wmnet
* 08:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7004.magru.wmnet
* 08:22 taavi@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet,service=s3
* 08:18 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251276{{!}}Fix broken survey links on PersonalDashboard (T419950)]] (duration: 32m 09s)
* 08:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7004.magru.wmnet
* 08:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7004.magru.wmnet
* 08:06 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 08:05 kgraessle@deploy2002: kgraessle: Continuing with sync
* 08:04 kgraessle@deploy2002: kgraessle: Backport for [[gerrit:1251276{{!}}Fix broken survey links on PersonalDashboard (T419950)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:59 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 07:52 moritzm: installing Linux 5.10.251 on Bullseye hosts
* 07:45 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1251276{{!}}Fix broken survey links on PersonalDashboard (T419950)]]
* 07:37 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stewards1001.eqiad.wmnet
* 07:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host stewards1001.eqiad.wmnet
* 07:33 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1003.eqiad.wmnet
* 07:26 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1003.eqiad.wmnet
* 07:25 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aphlict1002.eqiad.wmnet
* 07:21 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host aphlict1002.eqiad.wmnet
* 07:10 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host doc2003.codfw.wmnet
* 07:06 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host doc2003.codfw.wmnet
* 07:02 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists1004.wikimedia.org
* 06:55 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists1004.wikimedia.org
* 05:25 aikochou@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main'.
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 52s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-15 ==
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 52s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-14 ==
* 14:16 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251941{{!}}CommonSettings: Set class in $wgCentralAuthRC]] (duration: 06m 17s)
* 14:12 reedy@deploy2002: reedy: Continuing with sync
* 14:11 reedy@deploy2002: reedy: Backport for [[gerrit:1251941{{!}}CommonSettings: Set class in $wgCentralAuthRC]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:10 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1251941{{!}}CommonSettings: Set class in $wgCentralAuthRC]]
* 12:51 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251912{{!}}CommonSettings: Specify class in IRC RCFeed setup]] (duration: 06m 19s)
* 12:47 reedy@deploy2002: reedy, lcawte: Continuing with sync
* 12:46 reedy@deploy2002: reedy, lcawte: Backport for [[gerrit:1251912{{!}}CommonSettings: Specify class in IRC RCFeed setup]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:44 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1251912{{!}}CommonSettings: Specify class in IRC RCFeed setup]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 00s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-13 ==
* 22:52 taavi: taavi@deploy2002 ~ $ mwscript CentralAuth:attachAccount.php --wiki=metawiki --userlist backfiller.txt # unify unified Special:CentralAuth/MediaWikiAccountBackfiller on meta
* 20:07 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2004-dev.codfw.wmnet
* 20:01 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudservices2004-dev.codfw.wmnet
* 20:01 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2004-dev.codfw.wmnet
* 19:55 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4052.*
* 19:54 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudservices2004-dev.codfw.wmnet
* 19:54 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4052.ulsfo.wmnet with OS trixie
* 19:53 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2005-dev.codfw.wmnet
* 19:46 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudservices2005-dev.codfw.wmnet
* 19:46 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2004-dev.codfw.wmnet
* 19:40 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4050.*
* 19:40 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudservices2004-dev.codfw.wmnet
* 19:29 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 19:24 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 19:23 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4050.ulsfo.wmnet
* 19:19 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1035.eqiad.wmnet with OS trixie
* 19:19 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1034.eqiad.wmnet with OS trixie
* 19:18 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:18 jclark@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:18 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:16 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4051.*
* 19:15 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp4050.ulsfo.wmnet
* 19:14 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4050.ulsfo.wmnet
* 19:13 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:11 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4051.ulsfo.wmnet with OS trixie
* 19:07 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp4050.ulsfo.wmnet
* 19:02 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1035.eqiad.wmnet with reason: host reimage
* 19:00 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS bookworm
* 19:00 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:00 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 18:58 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1034.eqiad.wmnet with reason: host reimage
* 18:58 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS trixie
* 18:57 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4052.ulsfo.wmnet with OS trixie
* 18:55 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1035.eqiad.wmnet with reason: host reimage
* 18:55 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1034.eqiad.wmnet with reason: host reimage
* 18:47 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4051.ulsfo.wmnet with reason: host reimage
* 18:43 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 18:41 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4051.ulsfo.wmnet with reason: host reimage
* 18:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1035.eqiad.wmnet with OS trixie
* 18:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1034.eqiad.wmnet with OS trixie
* 18:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1033.eqiad.wmnet with OS trixie
* 18:36 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 18:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp4050.ulsfo.wmnet with reason: firmware updates
* 18:34 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS trixie
* 18:24 brett@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp4050.ulsfo.wmnet
* 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4050.ulsfo.wmnet with OS trixie
* 18:22 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS bookworm
* 18:21 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1374.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 18:21 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4051.ulsfo.wmnet with OS trixie
* 18:21 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4051.ulsfo.wmnet with OS trixie
* 18:12 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS bookworm
* 18:10 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1374.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 18:10 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:10 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 18:10 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 18:10 elukey@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1253.eqiad.wmnet with reason: Host went down and paged, depooled
* 18:06 cgoubert@cumin1003: dbctl commit (dc=all): 'Depool db1253', diff saved to https://phabricator.wikimedia.org/P89856 and previous config saved to /var/cache/conftool/dbconfig/20260313-180640-cgoubert.json
* 18:06 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 18:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4051.ulsfo.wmnet with OS trixie
* 18:03 elukey: powercycle db1253 - host not reachable via ssh, no events logged in racadm getsel, no console com2 available (blank screen)
* 17:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 17:56 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 17:49 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4049.*
* 17:46 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4049.ulsfo.wmnet with OS trixie
* 17:37 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:37 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:36 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie
* 17:35 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4050.ulsfo.wmnet with OS trixie
* 17:35 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:34 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:27 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 17:26 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 17:26 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 17:26 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 17:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4049.ulsfo.wmnet with reason: host reimage
* 17:17 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie
* 17:17 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:16 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:16 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4049.ulsfo.wmnet with reason: host reimage
* 17:12 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1016.eqiad.wmnet
* 17:12 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1016.eqiad.wmnet
* 17:11 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1016.eqiad.wmnet
* 17:11 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4048.*
* 17:10 dhinus: (relogging failed sal) conftool action : set/pooled=no; selector: name=clouddb1016.eqiad.wmnet
* 17:10 dhinus: (relogging failed sal) DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1016.eqiad.wmnet with reason: Rebooting clouddb1016 [[phab:T419960|T419960]]
* 17:09 dhinus: (relogging failed sal) END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1015.eqiad.wmnet
* 17:08 dhinus: (relogging failed sal) START - Cookbook sre.hosts.remove-downtime for clouddb1015.eqiad.wmnet
* 17:08 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 17:07 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 17:07 dhinus: fnegri@cumin1003 conftool action : set/pooled=yes; selector: name=clouddb1015.eqiad.wmnet
* 17:07 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 17:07 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4048.ulsfo.wmnet with OS trixie
* 17:06 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 16:40 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4049.ulsfo.wmnet with OS trixie
* 16:39 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4048.ulsfo.wmnet with reason: host reimage
* 16:36 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1015.eqiad.wmnet
* 16:35 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1015.eqiad.wmnet with reason: Rebooting clouddb1015 [[phab:T419960|T419960]]
* 16:34 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1014.eqiad.wmnet
* 16:34 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1014.eqiad.wmnet
* 16:34 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet
* 16:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet
* 16:28 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudweb1003.wikimedia.org
* 16:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet
* 16:22 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudweb1003.wikimedia.org
* 16:21 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudweb1004.wikimedia.org
* 16:20 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1014.eqiad.wmnet with reason: Rebooting clouddb1014 [[phab:T419960|T419960]]
* 16:20 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1014.eqiad.wmnet
* 16:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor-dev2001.codfw.wmnet
* 16:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4048.ulsfo.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4048.ulsfo.wmnet with OS trixie
* 16:16 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudweb1004.wikimedia.org
* 16:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor-dev2001.codfw.wmnet
* 16:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor2003.codfw.wmnet
* 16:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor2003.codfw.wmnet
* 16:00 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:00 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:00 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4048.ulsfo.wmnet with OS trixie
* 15:43 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2010-dev.codfw.wmnet
* 15:38 vgutierrez@cumin1003: END (PASS) - Cookbook sre.loadbalancer.check-ipip (exit_code=0)
* 15:38 vgutierrez@cumin1003: START - Cookbook sre.loadbalancer.check-ipip
* 15:37 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 15:37 vgutierrez@cumin1003: END (FAIL) - Cookbook sre.loadbalancer.check-ipip (exit_code=99)
* 15:37 vgutierrez@cumin1003: START - Cookbook sre.loadbalancer.check-ipip
* 15:36 vgutierrez@cumin1003: END (FAIL) - Cookbook sre.loadbalancer.check-ipip (exit_code=99)
* 15:36 vgutierrez@cumin1003: START - Cookbook sre.loadbalancer.check-ipip
* 15:36 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2010-dev.codfw.wmnet
* 15:35 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2005-dev.codfw.wmnet
* 15:35 vgutierrez@cumin1003: END (FAIL) - Cookbook sre.loadbalancer.check-ipip (exit_code=99)
* 15:35 vgutierrez@cumin1003: START - Cookbook sre.loadbalancer.check-ipip
* 15:28 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2005-dev.codfw.wmnet
* 15:26 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:25 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:23 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:22 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 15:22 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 15:22 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 15:19 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:19 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:19 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:16 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2006-dev.codfw.wmnet
* 15:12 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudidp2001-dev.codfw.wmnet
* 15:08 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudidp2001-dev.codfw.wmnet
* 15:07 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2006-dev.codfw.wmnet
* 14:58 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1013.eqiad.wmnet
* 14:58 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1013.eqiad.wmnet
* 14:57 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet,service=s3
* 14:57 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet,service=s1
* 14:48 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1015.eqiad.wmnet
* 14:46 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1035.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:46 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2005-dev.codfw.wmnet
* 14:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti-test2003.codfw.wmnet
* 14:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2003.codfw.wmnet
* 14:44 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1373.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:43 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1015.eqiad.wmnet
* 14:43 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1034.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:42 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1033.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:40 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1023
* 14:40 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1023
* 14:40 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1022
* 14:40 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1022
* 14:40 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1021
* 14:39 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup2004.codfw.wmnet
* 14:39 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1021
* 14:38 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2005-dev.codfw.wmnet
* 14:37 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1020
* 14:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2003.codfw.wmnet
* 14:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti-test2003.codfw.wmnet
* 14:35 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1020
* 14:35 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Rebooting clouddb1013 [[phab:T419960|T419960]]
* 14:33 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:33 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1035.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:33 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet
* 14:32 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1035.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:32 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1035.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti-test2003.codfw.wmnet
* 14:32 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1034.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti-test2003.codfw.wmnet
* 14:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1033.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1373.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:29 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:29 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt - jclark@cumin1003"
* 14:29 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt - jclark@cumin1003"
* 14:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti-test2002.codfw.wmnet
* 14:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet
* 14:27 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudbackup2004.codfw.wmnet
* 14:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cuminunpriv1001.eqiad.wmnet
* 14:25 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup2003.codfw.wmnet
* 14:25 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 14:25 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 14:24 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 14:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cuminunpriv1001.eqiad.wmnet
* 14:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet
* 14:22 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1004.eqiad.wmnet
* 14:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti-test2002.codfw.wmnet
* 14:14 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudbackup2003.codfw.wmnet
* 14:13 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudbackup1004.eqiad.wmnet
* 14:09 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1003.eqiad.wmnet
* 14:01 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudbackup1003.eqiad.wmnet
* 13:59 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gerrit1003.wikimedia.org
* 13:53 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gerrit1003.wikimedia.org
* 13:49 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists2001.wikimedia.org
* 13:48 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1004.eqiad.wmnet
* 13:46 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host etherpad1004.eqiad.wmnet
* 13:45 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:45 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:44 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1004.eqiad.wmnet
* 13:42 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists2001.wikimedia.org
* 13:42 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host etherpad1004.eqiad.wmnet
* 13:37 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host etherpad2002.codfw.wmnet
* 13:36 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gerrit2002.wikimedia.org
* 13:33 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host etherpad2002.codfw.wmnet
* 13:32 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2003.wikimedia.org
* 13:30 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gerrit2002.wikimedia.org
* 13:26 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab2003.wikimedia.org
* 13:26 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
* 13:24 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2020.codfw.wmnet
* 13:23 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2019.codfw.wmnet
* 13:19 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
* 13:19 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1003.wikimedia.org
* 13:13 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2020.codfw.wmnet
* 13:13 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1003.wikimedia.org
* 13:12 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2019.codfw.wmnet
* 13:11 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.reboot-runner (exit_code=0) rolling reboot on A:gitlab-runner
* 13:05 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2018.codfw.wmnet
* 13:05 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1020.eqiad.wmnet
* 12:54 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2018.codfw.wmnet
* 12:54 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1020.eqiad.wmnet
* 12:54 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2017.codfw.wmnet
* 12:54 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1019.eqiad.wmnet
* 12:53 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:50 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:50 moritzm: powercycle pki1002
* 12:48 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:47 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:44 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:44 mutante: rebooted phab1005 - waiting for it to come back
* 12:44 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:43 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2017.codfw.wmnet
* 12:43 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1019.eqiad.wmnet
* 12:42 bwojtowicz@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 12:40 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1018.eqiad.wmnet
* 12:39 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2016.codfw.wmnet
* 12:31 jelto@cumin1003: START - Cookbook sre.gitlab.reboot-runner rolling reboot on A:gitlab-runner
* 12:29 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1018.eqiad.wmnet
* 12:29 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1017.eqiad.wmnet
* 12:28 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2016.codfw.wmnet
* 12:27 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2015.codfw.wmnet
* 12:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast1004.wikimedia.org
* 12:18 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host doc1004.eqiad.wmnet
* 12:18 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1017.eqiad.wmnet
* 12:17 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:17 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:15 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2015.codfw.wmnet
* 12:15 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:15 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:14 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host doc1004.eqiad.wmnet
* 12:13 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aphlict2001.codfw.wmnet
* 12:10 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host aphlict2001.codfw.wmnet
* 12:10 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: reboot
* 12:10 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts2002.codfw.wmnet
* 12:07 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:07 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:03 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts2002.codfw.wmnet
* 12:02 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:02 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:01 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1016.eqiad.wmnet
* 12:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1019.eqiad.wmnet
* 11:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1018.eqiad.wmnet
* 11:59 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:59 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1019.eqiad.wmnet
* 11:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1018.eqiad.wmnet
* 11:51 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:51 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:50 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1016.eqiad.wmnet
* 11:49 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-backup2004.codfw.wmnet
* 11:43 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-backup2004.codfw.wmnet
* 11:43 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-backup1004.eqiad.wmnet
* 11:37 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-backup1004.eqiad.wmnet
* 11:36 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-backup2003.codfw.wmnet
* 11:34 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-backup1003.eqiad.wmnet
* 11:32 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 11:32 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:30 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-backup2003.codfw.wmnet
* 11:28 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-backup1003.eqiad.wmnet
* 11:27 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:26 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy1001.eqiad.wmnet
* 11:21 arnaudb@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host contint1003.wikimedia.org
* 11:21 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy1001.eqiad.wmnet
* 11:21 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy1002.eqiad.wmnet
* 11:16 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy1002.eqiad.wmnet
* 11:16 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy2001.codfw.wmnet
* 11:16 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host contint1003.wikimedia.org
* 11:12 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-staging-master-codfw
* 11:12 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul1001.eqiad.wmnet
* 11:11 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy2001.codfw.wmnet
* 11:11 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy2002.codfw.wmnet
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1018.eqiad.wmnet with reason: host reimage
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:09 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-staging-master-eqiad
* 11:08 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:08 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul1001.eqiad.wmnet
* 11:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1008-dev.eqiad.wmnet
* 11:07 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy2002.codfw.wmnet
* 11:06 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy3001.esams.wmnet
* 11:05 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1018.eqiad.wmnet with reason: host reimage
* 11:01 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy3001.esams.wmnet
* 11:01 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1008-dev.eqiad.wmnet
* 11:01 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1002-dev.eqiad.wmnet
* 11:01 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy3002.esams.wmnet
* 10:59 mvernon@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 22:00:00 on db1258.eqiad.wmnet with reason: depooled, likely to flap over the weekend
* 10:57 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudbackup1002-dev.eqiad.wmnet
* 10:57 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1001-dev.eqiad.wmnet
* 10:56 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy3002.esams.wmnet
* 10:56 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-staging-master-codfw
* 10:55 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4001.ulsfo.wmnet
* 10:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-staging-worker-codfw
* 10:54 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudbackup1001-dev.eqiad.wmnet
* 10:52 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-staging-master-eqiad
* 10:50 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-staging-worker-eqiad
* 10:50 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 10:50 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4001.ulsfo.wmnet
* 10:50 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4002.ulsfo.wmnet
* 10:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1019.eqiad.wmnet with reason: host reimage
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1019.eqiad.wmnet with reason: host reimage
* 10:45 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4002.ulsfo.wmnet
* 10:45 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy5001.eqsin.wmnet
* 10:40 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy5001.eqsin.wmnet
* 10:39 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy5002.eqsin.wmnet
* 10:37 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul2001.codfw.wmnet
* 10:37 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool', diff saved to https://phabricator.wikimedia.org/P89852 and previous config saved to /var/cache/conftool/dbconfig/20260313-103719-ladsgroup.json
* 10:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul2001.codfw.wmnet
* 10:32 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy5002.eqsin.wmnet
* 10:31 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudweb2002-dev.wikimedia.org
* 10:31 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul1002.eqiad.wmnet
* 10:31 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy6001.drmrs.wmnet
* 10:29 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:29 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:28 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:28 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:28 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 10:27 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 10:27 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:27 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul1002.eqiad.wmnet
* 10:27 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:26 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy6001.drmrs.wmnet
* 10:24 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudweb2002-dev.wikimedia.org
* 10:23 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy6002.drmrs.wmnet
* 10:22 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul2002.codfw.wmnet
* 10:19 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1008.eqiad.wmnet
* 10:18 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul2002.codfw.wmnet
* 10:18 arnaudb@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host zuul2002.codfw.wmnet
* 10:18 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul2002.codfw.wmnet
* 10:18 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy6002.drmrs.wmnet
* 10:16 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy7002.magru.wmnet
* 10:16 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-staging-worker-eqiad
* 10:15 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-staging-worker-codfw
* 10:13 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1008.eqiad.wmnet
* 10:13 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1007.eqiad.wmnet
* 10:12 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy7002.magru.wmnet
* 10:09 jelto@cumin1003: conftool action : set/pooled=yes; selector: name=tcp-proxy7001.magru.wmnet
* 10:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1007.eqiad.wmnet
* 10:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1006.eqiad.wmnet
* 10:07 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy7001.magru.wmnet
* 10:03 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy7001.magru.wmnet
* 10:02 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1006.eqiad.wmnet
* 10:02 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1005.eqiad.wmnet
* 10:01 jelto@cumin1003: conftool action : set/pooled=no; selector: name=tcp-proxy7001.magru.wmnet
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 09:57 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1005.eqiad.wmnet
* 09:57 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1004.eqiad.wmnet
* 09:51 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1004.eqiad.wmnet
* 09:51 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1003.eqiad.wmnet
* 09:50 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:50 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:46 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1003.eqiad.wmnet
* 09:46 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1002.eqiad.wmnet
* 09:41 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1002.eqiad.wmnet
* 09:40 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1001.eqiad.wmnet
* 09:39 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:39 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:35 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1001.eqiad.wmnet
* 09:35 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-ctrl1002.eqiad.wmnet
* 09:34 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-ctrl1001.eqiad.wmnet
* 09:34 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:33 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:32 moritzm: installing Linux 6.1.164 on Bookworm hosts
* 09:30 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-ctrl1002.eqiad.wmnet
* 09:28 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-ctrl1001.eqiad.wmnet
* 09:01 elukey@cumin1003: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 08:37 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 07:56 moritzm: installing Linux 6.12.74 on Trixie hosts
* 07:55 moritzm: installing 6.12.74 on Trixie hosts
* 02:57 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4044.ulsfo.wmnet [reason: trixie reimaging]
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 18s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:41 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4044.ulsfo.wmnet with OS trixie
* 01:37 mutante: contint1003/contint2003 - every time(?) we set up machines with puppet using our httpd module and PHP, the first puppet run hits the same old issue: "Exec[ensure_present_mod_php" failing, along with "Considering conflict mpm_worker for mpm_prefork". The fix is to run 'sudo a2dismod mpm_event' and then run puppet again. [[phab:T418521|T418521]]
* 01:26 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on contint1003.wikimedia.org with reason: [[phab:T418521|T418521]]
* 01:26 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on contint2003.wikimedia.org with reason: [[phab:T418521|T418521]]
* 01:23 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on contint2003.wikimedia.org with reason: setup
* 01:22 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on contint1003.wikimedia.org with reason: setup
* 01:22 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4047.*
* 01:09 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage
* 01:08 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4043.ulsfo.wmnet [reason: trixie reimaging]
* 01:06 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage
* 01:05 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4043.ulsfo.wmnet with OS trixie
* 00:55 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4047.ulsfo.wmnet with OS trixie
* 00:45 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4044.ulsfo.wmnet with OS trixie
* 00:45 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4044.ulsfo.wmnet [reason: trixie reimaging]
* 00:42 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4042.ulsfo.wmnet [reason: trixie reimaging]
* 00:41 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4042.ulsfo.wmnet with OS trixie
* 00:39 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4043.ulsfo.wmnet with reason: host reimage
* 00:31 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4043.ulsfo.wmnet with reason: host reimage
* 00:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4047.ulsfo.wmnet with reason: host reimage
* 00:27 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1251187 [[phab:T419637|T419637]] (duration: 07m 12s)
* 00:23 rzl@deploy2002: rzl: Continuing with sync
* 00:23 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4047.ulsfo.wmnet with reason: host reimage
* 00:22 rzl@deploy2002: rzl: https://gerrit.wikimedia.org/r/1251187 [[phab:T419637|T419637]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:21 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1251187 [[phab:T419637|T419637]]
* 00:15 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage
* 00:14 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4040.ulsfo.wmnet [reason: trixie reimaging]
* 00:11 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage
* 00:11 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4043.ulsfo.wmnet with OS trixie
* 00:10 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4040.ulsfo.wmnet with OS trixie
* 00:04 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4047.ulsfo.wmnet with OS trixie
* 00:03 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4047.ulsfo.wmnet with OS trixie
* 00:03 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4043.ulsfo.wmnet with OS trixie
== 2026-03-12 ==
* 23:57 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host o11ytest1001.eqiad.wmnet with OS trixie
* 23:53 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 23:53 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 23:50 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4042.ulsfo.wmnet with OS trixie
* 23:49 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 23:49 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 23:45 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 23:45 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 23:45 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage
* 23:44 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4042.ulsfo.wmnet with OS trixie
* 23:41 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage
* 23:41 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 23:41 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 23:40 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on o11ytest1001.eqiad.wmnet with reason: host reimage
* 23:36 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 23:36 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on o11ytest1001.eqiad.wmnet with reason: host reimage
* 23:36 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 23:35 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 23:35 rzl@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 23:22 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host o11ytest1001
* 23:22 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host o11ytest1001
* 23:21 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4040.ulsfo.wmnet with OS trixie
* 23:19 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4040.ulsfo.wmnet with OS trixie
* 23:18 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host o11ytest1001
* 23:18 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) o11ytest1001.eqiad.wmnet 141.32.64.10.in-addr.arpa 1.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 23:18 herron@cumin1003: START - Cookbook sre.dns.wipe-cache o11ytest1001.eqiad.wmnet 141.32.64.10.in-addr.arpa 1.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 23:18 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:18 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host o11ytest1001 - herron@cumin1003"
* 23:18 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host o11ytest1001 - herron@cumin1003"
* 23:04 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4047.ulsfo.wmnet with OS trixie
* 23:00 herron@cumin1003: START - Cookbook sre.dns.netbox
* 23:00 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host o11ytest1001
* 22:59 herron@cumin1003: START - Cookbook sre.hosts.reimage for host o11ytest1001.eqiad.wmnet with OS trixie
* 22:58 herron@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mwlog1002 to o11ytest1001
* 22:57 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host o11ytest1001
* 22:55 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host o11ytest1001
* 22:55 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) o11ytest1001 on all recursors
* 22:55 herron@cumin1003: START - Cookbook sre.dns.wipe-cache o11ytest1001 on all recursors
* 22:55 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:55 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mwlog1002 to o11ytest1001 - herron@cumin1003"
* 22:54 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mwlog1002 to o11ytest1001 - herron@cumin1003"
* 22:51 herron@cumin1003: START - Cookbook sre.dns.netbox
* 22:50 herron@cumin1003: START - Cookbook sre.hosts.rename from mwlog1002 to o11ytest1001
* 22:42 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4043.ulsfo.wmnet with OS trixie
* 22:42 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4043.ulsfo.wmnet [reason: trixie reimaging]
* 22:41 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4041.ulsfo.wmnet [reason: trixie reimaging]
* 22:39 bvibber@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251190{{!}}Enable ReaderExperiments Share Highlight subfeature for metrics (T416945)]], [[gerrit:1251195{{!}}Metrics module for share highlight experiment baseline (T416945)]] (duration: 06m 49s)
* 22:38 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4041.ulsfo.wmnet with OS trixie
* 22:35 bvibber@deploy2002: bvibber: Continuing with sync
* 22:34 bvibber@deploy2002: bvibber: Backport for [[gerrit:1251190{{!}}Enable ReaderExperiments Share Highlight subfeature for metrics (T416945)]], [[gerrit:1251195{{!}}Metrics module for share highlight experiment baseline (T416945)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:32 bvibber@deploy2002: Started scap sync-world: Backport for [[gerrit:1251190{{!}}Enable ReaderExperiments Share Highlight subfeature for metrics (T416945)]], [[gerrit:1251195{{!}}Metrics module for share highlight experiment baseline (T416945)]]
* 22:28 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1251182 [[phab:T419637|T419637]] (duration: 11m 18s)
* 22:27 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host o11ytest2001.codfw.wmnet with OS trixie
* 22:26 rzl@deploy2002: rzl: Continuing with sync
* 22:24 rzl@deploy2002: rzl: https://gerrit.wikimedia.org/r/1251182 [[phab:T419637|T419637]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:23 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4042.ulsfo.wmnet with OS trixie
* 22:23 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4042.ulsfo.wmnet [reason: trixie reimaging]
* 22:20 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4046.*
* 22:17 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1251182 [[phab:T419637|T419637]]
* 22:13 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4041.ulsfo.wmnet with reason: host reimage
* 22:09 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on o11ytest2001.codfw.wmnet with reason: host reimage
* 22:08 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4041.ulsfo.wmnet with reason: host reimage
* 22:03 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on o11ytest2001.codfw.wmnet with reason: host reimage
* 22:01 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:58 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4040.ulsfo.wmnet with OS trixie
* 21:58 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4040.ulsfo.wmnet [reason: trixie reimaging]
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host o11ytest2001
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host o11ytest2001
* 21:45 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host o11ytest2001
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) o11ytest2001.codfw.wmnet 9.32.192.10.in-addr.arpa 9.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:45 herron@cumin1003: START - Cookbook sre.dns.wipe-cache o11ytest2001.codfw.wmnet 9.32.192.10.in-addr.arpa 9.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host o11ytest2001 - herron@cumin1003"
* 21:45 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host o11ytest2001 - herron@cumin1003"
* 21:43 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4041.ulsfo.wmnet with OS trixie
* 21:41 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4038.ulsfo.wmnet [reason: trixie reimaging]
* 21:40 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4038.ulsfo.wmnet with OS trixie
* 21:39 herron@cumin1003: START - Cookbook sre.dns.netbox
* 21:39 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host o11ytest2001
* 21:39 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:39 herron@cumin1003: START - Cookbook sre.hosts.reimage for host o11ytest2001.codfw.wmnet with OS trixie
* 21:36 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 21:35 herron@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mwlog2002 to o11ytest2001
* 21:35 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 21:35 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 21:35 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host o11ytest2001
* 21:34 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 21:34 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 21:33 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 21:32 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host o11ytest2001
* 21:32 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) o11ytest2001 on all recursors
* 21:32 herron@cumin1003: START - Cookbook sre.dns.wipe-cache o11ytest2001 on all recursors
* 21:32 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:32 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mwlog2002 to o11ytest2001 - herron@cumin1003"
* 21:31 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mwlog2002 to o11ytest2001 - herron@cumin1003"
* 21:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4046.ulsfo.wmnet with OS trixie
* 21:27 herron@cumin1003: START - Cookbook sre.dns.netbox
* 21:26 herron@cumin1003: START - Cookbook sre.hosts.rename from mwlog2002 to o11ytest2001
* 21:20 rzl: rzl@apt1002:~$ sudo -i reprepro copy trixie-wikimedia bullseye-wikimedia envoyproxy
* 21:20 rzl: rzl@apt1002:~$ sudo -i reprepro copy bookworm-wikimedia bullseye-wikimedia envoyproxy
* 21:20 rzl: rzl@apt1002:~$ sudo -i reprepro -C main includedeb bullseye-wikimedia /srv/wikimedia/pool/component/envoy-future/e/envoyproxy/envoyproxy_1.35.9-1_amd64.deb
* 21:13 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage
* 21:13 cscott@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251173{{!}}Revert "Move post-processing of flaggedrevs views inside FlaggablePageView"]] (duration: 07m 28s)
* 21:09 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage
* 21:09 cscott@deploy2002: cscott: Continuing with sync
* 21:07 cscott@deploy2002: cscott: Backport for [[gerrit:1251173{{!}}Revert "Move post-processing of flaggedrevs views inside FlaggablePageView"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:05 cscott@deploy2002: Started scap sync-world: Backport for [[gerrit:1251173{{!}}Revert "Move post-processing of flaggedrevs views inside FlaggablePageView"]]
* 21:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage
* 21:02 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251152{{!}}Use 'alwaysShowLogin' query parameter during login (T419723)]], [[gerrit:1251150{{!}}login: Add 'alwaysShowLogin' login URL parameter (T419723)]], [[gerrit:1251168{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1250750{{!}}Enable parser survey for opted-out users on ru/pt/ja/id wikis (T414852)]] (duration: 10m 41s)
* 20:58 tgr@deploy2002: tgr, jsn, cscott: Continuing with sync
* 20:58 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage
* 20:54 tgr@deploy2002: tgr, jsn, cscott: Backport for [[gerrit:1251152{{!}}Use 'alwaysShowLogin' query parameter during login (T419723)]], [[gerrit:1251150{{!}}login: Add 'alwaysShowLogin' login URL parameter (T419723)]], [[gerrit:1251168{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1250750{{!}}Enable parser survey for opted-out users on ru/pt/ja/id wikis (T414852)]] synced to the testservers (see https://wikitech
* 20:52 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1251152{{!}}Use 'alwaysShowLogin' query parameter during login (T419723)]], [[gerrit:1251150{{!}}login: Add 'alwaysShowLogin' login URL parameter (T419723)]], [[gerrit:1251168{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1250750{{!}}Enable parser survey for opted-out users on ru/pt/ja/id wikis (T414852)]]
* 20:49 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 20:43 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251087{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251088{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251106{{!}}phpunit: Avoid unnecessary writes in generatePHPUnitConfig.php (T419107)]] (duration: 07m 37s)
* 20:39 tgr@deploy2002: tgr, daimona: Continuing with sync
* 20:37 tgr@deploy2002: tgr, daimona: Backport for [[gerrit:1251087{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251088{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251106{{!}}phpunit: Avoid unnecessary writes in generatePHPUnitConfig.php (T419107)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:37 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4046.ulsfo.wmnet with OS trixie
* 20:35 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1251087{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251088{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251106{{!}}phpunit: Avoid unnecessary writes in generatePHPUnitConfig.php (T419107)]]
* 20:35 jsn@deploy2002: Synchronized wmf-config/throttle.php: (no justification provided) (duration: 01m 57s)
* 20:32 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4045.*
* 20:28 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4041.ulsfo.wmnet with OS trixie
* 20:20 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 20:18 jsn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249364{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1251140{{!}}[arwikiquote] add namespace alias for NS_PROJECT (T419828)]], [[gerrit:1251098{{!}}Deploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1251164{{!}}Increase IP cap limit for azwiki (T419899)]] (duration: 11m 11s)
* 20:14 jsn@deploy2002: jsn, dani, nmw03, gergesshamon: Continuing with sync
* 20:09 jsn@deploy2002: jsn, dani, nmw03, gergesshamon: Backport for [[gerrit:1249364{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1251140{{!}}[arwikiquote] add namespace alias for NS_PROJECT (T419828)]], [[gerrit:1251098{{!}}Deploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1251164{{!}}Increase IP cap limit for azwiki (T419899)]] synced to the testservers (see https://wikitech.wikimedia.org/wik
* 20:07 jsn@deploy2002: Started scap sync-world: Backport for [[gerrit:1249364{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1251140{{!}}[arwikiquote] add namespace alias for NS_PROJECT (T419828)]], [[gerrit:1251098{{!}}Deploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1251164{{!}}Increase IP cap limit for azwiki (T419899)]]
* 19:21 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mathoid: apply
* 19:21 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mathoid: apply
* 19:20 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mathoid: apply
* 19:19 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mathoid: apply
* 19:16 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 19:16 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 19:15 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 19:14 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 19:13 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 19:12 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 19:12 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 19:11 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 19:07 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4041.ulsfo.wmnet with OS trixie
* 19:06 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4041.ulsfo.wmnet [reason: trixie reimaging]
* 19:06 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4039.ulsfo.wmnet [reason: trixie reimaging]
* 19:06 swfrench-wmf: reprepro include xdebug_3.4.4-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include wikidiff2_1.14.1-2+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include php-yaml_2.2.4-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include php-xhprof_2.3.10-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include php-wmerrors_2.0.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include php-uuid_1.3.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 brennen@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251138{{!}}EditPage: Re-add catch block for MWException (T419883)]] (duration: 09m 46s)
* 19:04 swfrench-wmf: reprepro include php-redis_6.2.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:04 swfrench-wmf: reprepro include php-pcov_1.0.12-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:04 swfrench-wmf: reprepro include php-memcached_3.3.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:04 swfrench-wmf: reprepro include php-luasandbox_4.1.2-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:03 swfrench-wmf: reprepro include php-imagick_3.7.0-13+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:03 swfrench-wmf: reprepro include php-excimer_1.2.5-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:01 brennen@deploy2002: somerandomdeveloper, brennen: Continuing with sync
* 18:59 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 18:57 brennen@deploy2002: somerandomdeveloper, brennen: Backport for [[gerrit:1251138{{!}}EditPage: Re-add catch block for MWException (T419883)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:55 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4039.ulsfo.wmnet with OS trixie
* 18:55 brennen@deploy2002: Started scap sync-world: Backport for [[gerrit:1251138{{!}}EditPage: Re-add catch block for MWException (T419883)]]
* 18:52 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mathoid: apply
* 18:52 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mathoid: apply
* 18:42 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp20(2[789]{{!}}3[0-9]{{!}}40).*,service=ats-be
* 18:34 brennen@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.19 refs [[phab:T413810|T413810]]
* 18:29 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4039.ulsfo.wmnet with reason: host reimage
* 18:29 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:28 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Updating dse-k8s-worker1019 - btullis@cumin1003"
* 18:26 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2332.codfw.wmnet
* 18:26 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2332.codfw.wmnet
* 18:25 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Updating dse-k8s-worker1019 - btullis@cumin1003"
* 18:24 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4039.ulsfo.wmnet with reason: host reimage
* 18:23 brennen@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251139{{!}}Ensure that we always run ParserHooks::transformHtml() when using Parsoid (T419830)]] (duration: 14m 46s)
* 18:21 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 18:20 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4038.ulsfo.wmnet with OS trixie
* 18:19 brennen@deploy2002: cscott, brennen: Continuing with sync
* 18:18 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 18:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4045.ulsfo.wmnet with OS trixie
* 18:10 brennen@deploy2002: cscott, brennen: Backport for [[gerrit:1251139{{!}}Ensure that we always run ParserHooks::transformHtml() when using Parsoid (T419830)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:10 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1019.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:08 brennen@deploy2002: Started scap sync-world: Backport for [[gerrit:1251139{{!}}Ensure that we always run ParserHooks::transformHtml() when using Parsoid (T419830)]]
* 18:02 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1019.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:02 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 17:59 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4039.ulsfo.wmnet with OS trixie
* 17:58 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1019
* 17:58 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1019
* 17:56 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4039.ulsfo.wmnet [reason: trixie reimaging]
* 17:55 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp20(3[6-9]{{!}}4[012]).*
* 17:54 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 17:53 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet [reason: trixie reimaging]
* 17:52 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4045.ulsfo.wmnet with reason: host reimage
* 17:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4037.ulsfo.wmnet with OS trixie
* 17:49 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4045.ulsfo.wmnet with reason: host reimage
* 17:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1018.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:33 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1018.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:31 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1018
* 17:31 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1018
* 17:30 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 17:28 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS trixie
* 17:28 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4045.ulsfo.wmnet with OS trixie
* 17:27 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp203[0-5].*
* 17:24 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4037.ulsfo.wmnet with reason: host reimage
* 17:20 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 17:20 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 17:18 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4037.ulsfo.wmnet with reason: host reimage
* 17:17 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-backup1004.eqiad.wmnet with OS trixie
* 17:17 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 17:16 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 17:06 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp202[89].*
* 17:03 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp2027.*
* 16:59 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 16:58 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4038.ulsfo.wmnet [reason: trixie reimaging]
* 16:58 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-backup1004.eqiad.wmnet with reason: host reimage
* 16:58 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:57 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4037.ulsfo.wmnet with OS trixie
* 16:57 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet [reason: trixie reimaging]
* 16:55 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-backup1004.eqiad.wmnet with reason: host reimage
* 16:50 swfrench-wmf: reprepro include php-apcu_5.1.24-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:45 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 16:45 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 16:44 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 16:43 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 16:43 swfrench-wmf: reprepro include dh-php_5.5+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:42 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 16:41 swfrench-wmf: reprepro include php-defaults_94+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:37 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ms-backup1004.eqiad.wmnet with OS trixie
* 16:36 swfrench-wmf: reprepro include php8.3_8.3.30-1+wmf11u2+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:27 dzahn@dns1004: END - running authdns-update
* 16:26 dzahn@dns1004: START - running authdns-update
* 16:25 mutante: switching old status.wikimedia.org page away from rackspace [[phab:T414098|T414098]]
* 16:25 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS trixie
* 16:20 dzahn@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 16:20 dzahn@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 16:19 dzahn@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 16:19 dzahn@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 16:12 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
* 16:11 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
* 16:10 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 16:09 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 16:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 16:09 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 16:08 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 16:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 16:07 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 16:06 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 16:05 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
* 16:04 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
* 16:04 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 16:04 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 16:03 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
* 16:02 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
* 16:02 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 16:02 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 16:01 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 15:58 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 15:57 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 15:57 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 15:56 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 15:52 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudgw2002-dev.codfw.wmnet
* 15:52 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:52 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw2002-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 15:47 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw2002-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 15:43 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 15:41 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:40 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:36 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudgw2002-dev.codfw.wmnet
* 15:35 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 15:33 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 15:27 ebernhardson@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 15:26 ebernhardson@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 15:19 moritzm: reuploaded libxml2 2.9.10+dfsg-6.7+deb11u9+wmf11u1 and 72.1-3+deb12u1~wmf11u1 to component/php83-icu72 for bullseye-wikimedia [[phab:T419058|T419058]]
* 15:14 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:13 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:13 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=tcp-proxy4004.ulsfo.wmnet
* 15:13 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=tcp-proxy4004.ulsfo.wmnet
* 15:12 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=tcp-proxy4003.ulsfo.wmnet
* 15:12 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=tcp-proxy4003.ulsfo.wmnet
* 15:00 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 15:00 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:57 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:56 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:45 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:44 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1018.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:34 andrew@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 14:33 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1018.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:31 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 14:31 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1018
* 14:31 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1018
* 14:25 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:24 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:20 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1020.eqiad.wmnet
* 14:15 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 24 hosts with reason: Switch BGP bounce
* 14:12 ayounsi@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet
* 14:09 mlitn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251034{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251035{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251036{{!}}Remove queueing logic (T419587)]], [[gerrit:1251037{{!}}Remove queueing logic (T419587)]] (duration: 07m 15s)
* 14:08 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 14:07 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:07 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:05 mlitn@deploy2002: mlitn: Continuing with sync
* 14:04 mlitn@deploy2002: mlitn: Backport for [[gerrit:1251034{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251035{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251036{{!}}Remove queueing logic (T419587)]], [[gerrit:1251037{{!}}Remove queueing logic (T419587)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:03 XioNoX: start eqiad rack D2 depools
* 14:02 mlitn@deploy2002: Started scap sync-world: Backport for [[gerrit:1251034{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251035{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251036{{!}}Remove queueing logic (T419587)]], [[gerrit:1251037{{!}}Remove queueing logic (T419587)]]
* 13:59 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:59 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:57 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:54 moritzm: installing libssh security updates
* 13:54 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:45 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251031{{!}}ext.testKitchen: Depend on mediawiki.user module]], [[gerrit:1251048{{!}}Add title to the request context in FlaggedRevsCacheTest (T419539)]], [[gerrit:1251032{{!}}ext.testKitchen: Depend on mediawiki.user module]] (duration: 08m 01s)
* 13:42 phuedx@deploy2002: phuedx: Continuing with sync
* 13:39 phuedx@deploy2002: phuedx: Backport for [[gerrit:1251031{{!}}ext.testKitchen: Depend on mediawiki.user module]], [[gerrit:1251048{{!}}Add title to the request context in FlaggedRevsCacheTest (T419539)]], [[gerrit:1251032{{!}}ext.testKitchen: Depend on mediawiki.user module]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:37 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1251031{{!}}ext.testKitchen: Depend on mediawiki.user module]], [[gerrit:1251048{{!}}Add title to the request context in FlaggedRevsCacheTest (T419539)]], [[gerrit:1251032{{!}}ext.testKitchen: Depend on mediawiki.user module]]
* 13:26 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251005{{!}}Deploy EditCheck suggestion mode at all Wikipedias (T415320)]] (duration: 06m 42s)
* 13:24 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 esanders@deploy2002: esanders: Continuing with sync
* 13:22 esanders@deploy2002: esanders: Backport for [[gerrit:1251005{{!}}Deploy EditCheck suggestion mode at all Wikipedias (T415320)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1178.eqiad.wmnet
* 13:21 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 13:20 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1251005{{!}}Deploy EditCheck suggestion mode at all Wikipedias (T415320)]]
* 13:18 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250656{{!}}Add multilingual revert risk host header for LiftWing requests (T419718)]] (duration: 10m 52s)
* 13:14 fnegri@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.add-wiki (exit_code=99) for database kaiwiki ([[phab:T414240|T414240]])
* 13:14 fnegri@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database kaiwiki ([[phab:T414240|T414240]])
* 13:14 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 13:14 kgraessle@deploy2002: kgraessle: Continuing with sync
* 13:12 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 13:11 kgraessle@deploy2002: kgraessle: Backport for [[gerrit:1250656{{!}}Add multilingual revert risk host header for LiftWing requests (T419718)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 13:07 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1250656{{!}}Add multilingual revert risk host header for LiftWing requests (T419718)]]
* 13:05 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1159.eqiad.wmnet
* 13:03 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 13:02 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 13:02 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:02 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:57 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1159.eqiad.wmnet
* 12:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1013.eqiad.wmnet
* 12:49 dpogorzelski@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop: sync
* 12:49 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1013.eqiad.wmnet
* 12:49 dpogorzelski@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: sync
* 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:31 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host durum4004.ulsfo.wmnet
* 12:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4004.ulsfo.wmnet
* 12:28 moritzm: installing postgresql-17 security updates
* 12:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4004.ulsfo.wmnet
* 12:14 moritzm: installing wireshark security updates
* 12:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1013.eqiad.wmnet with reason: host reimage
* 12:07 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1013.eqiad.wmnet with reason: host reimage
* 11:52 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:51 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:50 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host tcp-proxy4004.ulsfo.wmnet
* 11:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy4004.ulsfo.wmnet with OS trixie
* 11:49 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy4004.ulsfo.wmnet with reason: host reimage
* 11:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy4004.ulsfo.wmnet with reason: host reimage
* 11:19 jayme: disabled puppet on all wikikube worker nodes to rollout/test new apparmor profiles in staging - [[phab:T419781|T419781]]
* 11:07 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host tcp-proxy4004.ulsfo.wmnet with OS trixie
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy4004.ulsfo.wmnet - jmm@cumin2002"
* 11:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy4004.ulsfo.wmnet - jmm@cumin2002"
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) tcp-proxy4004.ulsfo.wmnet on all recursors
* 11:06 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache tcp-proxy4004.ulsfo.wmnet on all recursors
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy4004.ulsfo.wmnet - jmm@cumin2002"
* 11:03 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:00 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy4004.ulsfo.wmnet - jmm@cumin2002"
* 10:42 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device asw1-23-ulsfo
* 10:41 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device asw1-23-ulsfo
* 10:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1013.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 10:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4003.ulsfo.wmnet
* 10:31 vgutierrez@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir4001.ulsfo.wmnet
* 10:31 vgutierrez@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir4002.ulsfo.wmnet
* 10:31 vgutierrez@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir4004.ulsfo.wmnet
* 10:30 vgutierrez@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir4003.ulsfo.wmnet
* 10:30 vgutierrez: repooling ncredir4003 & ncredir4004
* 10:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4003.ulsfo.wmnet
* 10:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host tcp-proxy4004.ulsfo.wmnet
* 10:26 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1013.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 10:26 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:25 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1013
* 10:22 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1013
* 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host tcp-proxy4003.ulsfo.wmnet
* 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy4003.ulsfo.wmnet with OS trixie
* 10:12 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1011.eqiad.wmnet
* 10:12 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:11 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:11 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:10 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:09 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1011.eqiad.wmnet
* 10:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1010.eqiad.wmnet
* 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy4003.ulsfo.wmnet with reason: host reimage
* 10:00 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1010.eqiad.wmnet
* 09:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy4003.ulsfo.wmnet with reason: host reimage
* 09:48 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/SERVICE_NAME: apply
* 09:48 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/SERVICE_NAME: apply
* 09:40 mvernon@cumin2002: conftool action : set/pooled=yes; selector: name=ms-fe2024.codfw.wmnet
* 09:40 mvernon@cumin2002: conftool action : set/pooled=yes; selector: name=ms-fe2023.codfw.wmnet
* 09:40 mvernon@cumin2002: conftool action : set/pooled=yes; selector: name=ms-fe2022.codfw.wmnet
* 09:40 mvernon@cumin2002: conftool action : set/pooled=yes; selector: name=ms-fe2021.codfw.wmnet
* 09:39 mvernon@cumin2002: conftool action : set/weight=40; selector: name=ms-fe2024.codfw.wmnet
* 09:39 mvernon@cumin2002: conftool action : set/weight=40; selector: name=ms-fe2023.codfw.wmnet
* 09:39 mvernon@cumin2002: conftool action : set/weight=40; selector: name=ms-fe2022.codfw.wmnet
* 09:39 mvernon@cumin2002: conftool action : set/weight=40; selector: name=ms-fe2021.codfw.wmnet
* 09:39 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Post reimage - btullis@cumin1003"
* 09:39 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Post reimage - btullis@cumin1003"
* 09:39 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 09:39 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 09:38 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 09:38 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 09:35 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on P<nowiki>{</nowiki>ms-fe[2009-2020].codfw.wmnet<nowiki>}</nowiki> and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 09:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host durum4004.ulsfo.wmnet
* 09:32 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 09:32 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host tcp-proxy4003.ulsfo.wmnet with OS trixie
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy4003.ulsfo.wmnet - jmm@cumin2002"
* 09:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy4003.ulsfo.wmnet - jmm@cumin2002"
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) tcp-proxy4003.ulsfo.wmnet on all recursors
* 09:30 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache tcp-proxy4003.ulsfo.wmnet on all recursors
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy4003.ulsfo.wmnet - jmm@cumin2002"
* 09:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy4003.ulsfo.wmnet - jmm@cumin2002"
* 09:28 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on P<nowiki>{</nowiki>ms-fe[2009-2020].codfw.wmnet<nowiki>}</nowiki> and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 09:28 Emperor: roll-restart codfw ms frontends prior to pooling new ones [[phab:T416243|T416243]]
* 09:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host durum4003.ulsfo.wmnet
* 09:23 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:23 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host tcp-proxy4003.ulsfo.wmnet
* 09:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host durum4003.ulsfo.wmnet
* 09:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts netflow4002.ulsfo.wmnet
* 09:01 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:01 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: netflow4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:56 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: netflow4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:51 slyngshede@dns1004: END - running authdns-update
* 08:50 slyngshede@dns1004: START - running authdns-update
* 08:37 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:32 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts netflow4002.ulsfo.wmnet
* 08:25 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 08:23 arnaudb@dns1004: END - running authdns-update
* 08:21 arnaudb@dns1004: START - running authdns-update
* 07:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir4004.ulsfo.wmnet
* 07:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncredir4004.ulsfo.wmnet
* 07:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir4003.ulsfo.wmnet
* 07:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncredir4003.ulsfo.wmnet
* 05:24 kart_: staging: machinetranslation: Optimize model loading and memory footprints ([[phab:T411058|T411058]])
* 05:19 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
* 05:16 kartik@deploy2002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
* 02:16 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2005.codfw.wmnet with OS trixie
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 14s)
* 02:03 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:59 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2005.codfw.wmnet with reason: host reimage
* 01:52 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2005.codfw.wmnet with reason: host reimage
* 01:49 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:47 swfrench-wmf: reprepro include php-apcu_5.1.24-1+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:37 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2005.codfw.wmnet with OS trixie
* 01:36 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2004.codfw.wmnet with OS trixie
* 01:24 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7012.*
* 01:20 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 01:18 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2004.codfw.wmnet with reason: host reimage
* 01:18 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 01:15 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2004.codfw.wmnet with reason: host reimage
* 01:13 swfrench-wmf: reprepro include dh-php_5.5+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:08 swfrench-wmf: reprepro include php-defaults_94+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:05 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 01:05 rzl@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 01:03 swfrench-wmf: reprepro include php8.3_8.3.30-1+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:00 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2004.codfw.wmnet with OS trixie
* 00:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7012.magru.wmnet with OS trixie
* 00:59 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - brett@cumin2002"
* 00:58 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - brett@cumin2002"
* 00:38 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 00:38 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 00:37 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 00:37 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 00:36 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 00:36 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 00:33 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 00:29 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7012.magru.wmnet with reason: host reimage
* 00:27 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 00:24 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7012.magru.wmnet with reason: host reimage
* 00:03 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7012.magru.wmnet with OS trixie
== 2026-03-11 ==
* 23:56 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7009.*
* 22:52 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:52 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:45 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 22:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 22:29 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 22:29 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 22:27 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7009.magru.wmnet with OS trixie
* 21:56 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 21:55 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 21:54 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250051{{!}}OrchestratorRequest: Switch evaluations to v2 endpoint (T413727)]] (duration: 18m 19s)
* 21:47 jforrester@deploy2002: jforrester: Continuing with sync
* 21:43 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:43 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7009.magru.wmnet with reason: host reimage
* 21:42 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:40 jforrester@deploy2002: jforrester: Backport for [[gerrit:1250051{{!}}OrchestratorRequest: Switch evaluations to v2 endpoint (T413727)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:39 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7009.magru.wmnet with reason: host reimage
* 21:35 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1250051{{!}}OrchestratorRequest: Switch evaluations to v2 endpoint (T413727)]]
* 21:30 rzl: rzl@apt1002:~$ sudo -i reprepro -C component/envoy-future include bullseye-wikimedia /home/rzl/envoyproxy_1.35.9-1_amd64.changes
* 21:29 arlolra@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250665{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]], [[gerrit:1250666{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]] (duration: 35m 16s)
* 21:16 arlolra@deploy2002: arlolra: Continuing with sync
* 21:15 arlolra@deploy2002: arlolra: Backport for [[gerrit:1250665{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]], [[gerrit:1250666{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:08 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7009.magru.wmnet with OS trixie
* 21:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7010.*
* 21:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7010.magru.wmnet with OS trixie
* 20:54 arlolra@deploy2002: Started scap sync-world: Backport for [[gerrit:1250665{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]], [[gerrit:1250666{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]]
* 20:47 jsn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250579{{!}}urwikisource: add logo, sitename and projectnamespace (T415974)]] (duration: 06m 55s)
* 20:43 jsn@deploy2002: anzx, jsn: Continuing with sync
* 20:42 jsn@deploy2002: anzx, jsn: Backport for [[gerrit:1250579{{!}}urwikisource: add logo, sitename and projectnamespace (T415974)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:40 jsn@deploy2002: Started scap sync-world: Backport for [[gerrit:1250579{{!}}urwikisource: add logo, sitename and projectnamespace (T415974)]]
* 20:38 jsn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250581{{!}}riskyArticleEdits: show page descriptions (T419442)]], [[gerrit:1250582{{!}}Fix Instrumentation on mobile view (T419517)]], [[gerrit:1250632{{!}}ext.wikimediaEvents: Updated Test Kitchen impact test experiment (T407570)]] (duration: 10m 37s)
* 20:38 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on ml-serve1014.eqiad.wmnet with reason: [[phab:T400626|T400626]]
* 20:37 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7010.magru.wmnet with reason: host reimage
* 20:34 jsn@deploy2002: jsn, sfaci: Continuing with sync
* 20:34 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search-test: apply
* 20:33 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search-test: apply
* 20:32 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7010.magru.wmnet with reason: host reimage
* 20:30 jsn@deploy2002: jsn, sfaci: Backport for [[gerrit:1250581{{!}}riskyArticleEdits: show page descriptions (T419442)]], [[gerrit:1250582{{!}}Fix Instrumentation on mobile view (T419517)]], [[gerrit:1250632{{!}}ext.wikimediaEvents: Updated Test Kitchen impact test experiment (T407570)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:28 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on gitlab1003.wikimedia.org with reason: Upgrade
* 20:28 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on gitlab2002.wikimedia.org with reason: Upgrade
* 20:27 jsn@deploy2002: Started scap sync-world: Backport for [[gerrit:1250581{{!}}riskyArticleEdits: show page descriptions (T419442)]], [[gerrit:1250582{{!}}Fix Instrumentation on mobile view (T419517)]], [[gerrit:1250632{{!}}ext.wikimediaEvents: Updated Test Kitchen impact test experiment (T407570)]]
* 20:21 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:18 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 20:17 bvibber@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250647{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]], [[gerrit:1250648{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]] (duration: 06m 47s)
* 20:13 bvibber@deploy2002: bvibber: Continuing with sync
* 20:12 bvibber@deploy2002: bvibber: Backport for [[gerrit:1250647{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]], [[gerrit:1250648{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:10 bvibber@deploy2002: Started scap sync-world: Backport for [[gerrit:1250647{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]], [[gerrit:1250648{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]]
* 19:59 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7010.magru.wmnet with OS trixie
* 19:54 andrew@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 19:51 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 19:37 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-backup1004.eqiad.wmnet with OS trixie
* 19:01 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts cp7011.magru.wmnet
* 19:01 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7011.magru.wmnet
* 18:56 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough
* 18:49 brennen@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.19 refs [[phab:T413810|T413810]]
* 18:49 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:49 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:45 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:45 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:44 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:44 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:43 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling restart_daemons on A:wikidough
* 18:42 brennen: 1.46.0-wmf.19 train status: no current blockers, going ahead to group1.
* 18:39 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2332.codfw.wmnet
* 18:37 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2332.codfw.wmnet
* 18:20 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7011.*
* 18:18 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 18:16 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ms-backup1004.eqiad.wmnet with OS trixie
* 18:13 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 17:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 17:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1010.eqiad.wmnet with reason: host reimage
* 17:52 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:52 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:48 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 17:47 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1010.eqiad.wmnet with reason: host reimage
* 17:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 17:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 17:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:38 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:38 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:37 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:37 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:36 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:36 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:36 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:35 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 17:34 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 17:32 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
* 17:31 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
* 17:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:28 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:28 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:27 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:20 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:19 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 17:19 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:18 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:15 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 17:13 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:12 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:09 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:09 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:02 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7011.magru.wmnet with OS trixie
* 17:01 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 17:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 16:58 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on durum4004.ulsfo.wmnet with reason: in setup
* 16:58 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on durum4003.ulsfo.wmnet with reason: in setup
* 16:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 16:40 root@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:40 root@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moving many things from cloudgw2002-dev to cloudgw2004-dev - root@cumin2002"
* 16:40 root@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moving many things from cloudgw2002-dev to cloudgw2004-dev - root@cumin2002"
* 16:39 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 16:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 16:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7011.magru.wmnet with reason: host reimage
* 16:35 root@cumin2002: START - Cookbook sre.dns.netbox
* 16:32 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus4002.ulsfo.wmnet
* 16:32 tappof@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:32 tappof@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - tappof@cumin1003"
* 16:30 tappof@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - tappof@cumin1003"
* 16:30 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7011.magru.wmnet with reason: host reimage
* 16:25 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 16:23 tappof@cumin1003: START - Cookbook sre.dns.netbox
* 16:18 tappof@cumin1003: START - Cookbook sre.hosts.decommission for hosts prometheus4002.ulsfo.wmnet
* 15:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:52 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7011.magru.wmnet with OS trixie
* 15:51 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 15:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 15:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 15:50 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:49 urbanecm@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:48 sukhe: sudo cumin -b1 -s10 "C:dnsrecursor" "run-puppet-agent --enable 'merging CR 1250576'"
* 15:48 urbanecm@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:46 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:45 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 15:43 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 15:39 sukhe: sudo cumin "C:dnsrecursor" "disable-puppet 'merging CR 1250576'"
* 15:35 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 15:26 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 15:08 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 15:08 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 15:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:53 swfrench-wmf: updated component/php83-icu72 with libpcre2 10.42-1~wmf11+1 from apt-staging - [[phab:T419058|T419058]]
* 14:46 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 14:45 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 14:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum4004.ulsfo.wmnet
* 14:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum4004.ulsfo.wmnet with OS trixie
* 14:39 vgutierrez: depool ncredir4003 && ncredir4004
* 14:38 vgutierrez: repool ncredir4001 && ncredir4002
* 14:31 jmm@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir4002.ulsfo.wmnet
* 14:31 jmm@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir4001.ulsfo.wmnet
* 14:30 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir4004.ulsfo.wmnet
* 14:30 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=ncredir4004.ulsfo.wmnet
* 14:27 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir4003.ulsfo.wmnet
* 14:27 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=ncredir4003.ulsfo.wmnet
* 14:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum4004.ulsfo.wmnet with reason: host reimage
* 14:23 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:23 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:19 moritzm: installing python-urllib3 security updates
* 14:18 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum4004.ulsfo.wmnet with reason: host reimage
* 14:14 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 14:13 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:12 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:12 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:12 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:12 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:11 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:11 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:11 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:08 gkyziridis@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:08 gkyziridis@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:07 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250568{{!}}Fix pinnableElement export (T419620)]] (duration: 06m 26s)
* 14:06 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:04 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:04 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:03 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:03 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 14:03 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:02 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1250568{{!}}Fix pinnableElement export (T419620)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:00 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1250568{{!}}Fix pinnableElement export (T419620)]]
* 13:58 moritzm: uploaded libxml2 2.9.10+dfsg-6.7+deb11u9+wmf11u1 to component/php83-icu72 for bullseye-wikimedia (special build of libxml with ICU disabled to ensure co-installability between icu 67 and icu 72) [[phab:T419058|T419058]]
* 13:57 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250566{{!}}Restore advanced main menu for AMC (T413912)]] (duration: 10m 44s)
* 13:55 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum4004.ulsfo.wmnet with OS trixie
* 13:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:54 vgutierrez: repool cp7016
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4004.ulsfo.wmnet - jmm@cumin2002"
* 13:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4004.ulsfo.wmnet - jmm@cumin2002"
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum4004.ulsfo.wmnet on all recursors
* 13:54 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum4004.ulsfo.wmnet on all recursors
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4004.ulsfo.wmnet - jmm@cumin2002"
* 13:51 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 13:50 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1250566{{!}}Restore advanced main menu for AMC (T413912)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:49 vgutierrez: depool cp7016
* 13:49 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4004.ulsfo.wmnet - jmm@cumin2002"
* 13:46 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1250566{{!}}Restore advanced main menu for AMC (T413912)]]
* 13:45 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:44 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:44 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247547{{!}}Remove `MetricsPlatform` configuration from production (T416865)]] (duration: 35m 52s)
* 13:43 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 13:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 13:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4004.ulsfo.wmnet with OS bookworm
* 13:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:36 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum4004.ulsfo.wmnet
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum4003.ulsfo.wmnet
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum4003.ulsfo.wmnet with OS trixie
* 13:36 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 13:35 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 13:30 jdlrobson@deploy2002: jdlrobson, sfaci: Continuing with sync
* 13:29 jdlrobson@deploy2002: jdlrobson, sfaci: Backport for [[gerrit:1247547{{!}}Remove `MetricsPlatform` configuration from production (T416865)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
* 13:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum4003.ulsfo.wmnet with reason: host reimage
* 13:18 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
* 13:13 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum4003.ulsfo.wmnet with reason: host reimage
* 13:08 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1247547{{!}}Remove `MetricsPlatform` configuration from production (T416865)]]
* 13:00 moritzm: installing libcommons-lang3-java security updates
* 12:57 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4004.ulsfo.wmnet with OS bookworm
* 12:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4003.ulsfo.wmnet with OS bookworm
* 12:46 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum4003.ulsfo.wmnet with OS trixie
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4003.ulsfo.wmnet - jmm@cumin2002"
* 12:46 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4003.ulsfo.wmnet - jmm@cumin2002"
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum4003.ulsfo.wmnet on all recursors
* 12:45 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum4003.ulsfo.wmnet on all recursors
* 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4003.ulsfo.wmnet - jmm@cumin2002"
* 12:41 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4003.ulsfo.wmnet - jmm@cumin2002"
* 12:37 moritzm: installing inetutils security updates
* 12:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:36 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum4003.ulsfo.wmnet
* 12:35 tappof: completed migration from prometheus4002 to prometheus4003 (ulsfo) (T419430)
* 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
* 12:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
* 12:28 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 12:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2073.codfw.wmnet with OS bullseye
* 12:23 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 12:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow4003.ulsfo.wmnet
* 12:18 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 12:17 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1011
* 12:17 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1011
* 12:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2072.codfw.wmnet with OS bullseye
* 12:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow4003.ulsfo.wmnet
* 12:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2073.codfw.wmnet with reason: host reimage
* 12:04 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4003.ulsfo.wmnet with OS bookworm
* 12:01 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1010
* 11:59 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1010
* 11:58 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2073.codfw.wmnet with reason: host reimage
* 11:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2072.codfw.wmnet with reason: host reimage
* 11:48 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2072.codfw.wmnet with reason: host reimage
* 11:41 urbanecm@deploy2002: Finished scap sync-world: Backport for [[gerrit:1239954{{!}}[Growth] Enable on every new Wikipedia by default (T304052)]] (duration: 06m 39s)
* 11:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2073
* 11:38 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2073
* 11:37 vgutierrez: upgrading to acme-chief 0.39 on acme-chief production instances - [[phab:T419352|T419352]]
* 11:37 urbanecm@deploy2002: urbanecm: Continuing with sync
* 11:36 urbanecm@deploy2002: urbanecm: Backport for [[gerrit:1239954{{!}}[Growth] Enable on every new Wikipedia by default (T304052)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:36 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2073
* 11:36 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2073.codfw.wmnet 212.48.192.10.in-addr.arpa 2.1.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:36 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2073.codfw.wmnet 212.48.192.10.in-addr.arpa 2.1.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:36 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:36 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2073 - mvernon@cumin2002"
* 11:36 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2073 - mvernon@cumin2002"
* 11:35 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 11:34 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 11:34 urbanecm@deploy2002: Started scap sync-world: Backport for [[gerrit:1239954{{!}}[Growth] Enable on every new Wikipedia by default (T304052)]]
* 11:34 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 11:34 urbanecm@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250539{{!}}[Growth] kaiwiki: Enable GrowthExperiments (T304052)]] (duration: 14m 11s)
* 11:33 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 11:33 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 11:32 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 11:32 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:31 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2073
* 11:30 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2073.codfw.wmnet with OS bullseye
* 11:30 urbanecm@deploy2002: urbanecm: Continuing with sync
* 11:29 cgoubert@dns1004: END - running authdns-update
* 11:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2072
* 11:29 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2072
* 11:28 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2072
* 11:28 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2072.codfw.wmnet 158.32.192.10.in-addr.arpa 8.5.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:28 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2072.codfw.wmnet 158.32.192.10.in-addr.arpa 8.5.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:28 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:28 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2072 - mvernon@cumin2002"
* 11:28 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2072 - mvernon@cumin2002"
* 11:28 cgoubert@dns1004: START - running authdns-update
* 11:26 urbanecm@deploy2002: mwscript-k8s job started: WikimediaMaintenance:createExtensionTables.php --wiki=kaiwiki growthexperiments # [[phab:T304052|T304052]]
* 11:24 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:24 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2072
* 11:23 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2072.codfw.wmnet with OS bullseye
* 11:22 tappof@dns1004: END - running authdns-update
* 11:22 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
* 11:21 urbanecm@deploy2002: urbanecm: Backport for [[gerrit:1250539{{!}}[Growth] kaiwiki: Enable GrowthExperiments (T304052)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:21 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
* 11:21 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
* 11:21 tappof@dns1004: START - running authdns-update
* 11:21 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
* 11:19 urbanecm@deploy2002: Started scap sync-world: Backport for [[gerrit:1250539{{!}}[Growth] kaiwiki: Enable GrowthExperiments (T304052)]]
* 11:19 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 11:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2071.codfw.wmnet with OS bullseye
* 11:18 urbanecm@deploy2002: mwscript-k8s job started: WikimediaMaintenance:createExtensionTables.php --wiki=kaiwiki growthexperiments # [[phab:T304052|T304052]]
* 11:10 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
* 11:10 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
* 11:08 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
* 11:08 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
* 11:05 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
* 11:05 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
* 10:58 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2071.codfw.wmnet with reason: host reimage
* 10:54 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2071.codfw.wmnet with reason: host reimage
* 10:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2071
* 10:35 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2071
* 10:34 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2071
* 10:34 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2071.codfw.wmnet 221.16.192.10.in-addr.arpa 1.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:34 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2071.codfw.wmnet 221.16.192.10.in-addr.arpa 1.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:34 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:34 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2071 - mvernon@cumin2002"
* 10:34 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2071 - mvernon@cumin2002"
* 10:26 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 10:24 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2071
* 10:23 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2071.codfw.wmnet with OS bullseye
* 10:08 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2095.codfw.wmnet with OS bullseye
* 10:03 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Failed step after ml-serve1015's reimage - elukey@cumin1003"
* 10:02 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Failed step after ml-serve1015's reimage - elukey@cumin1003"
* 10:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1015.eqiad.wmnet with OS trixie
* 10:01 elukey@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 09:59 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2096.codfw.wmnet with OS bullseye
* 09:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2096.codfw.wmnet with OS bullseye
* 09:52 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
* 09:51 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
* 09:51 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2095.codfw.wmnet with OS bullseye
* 09:51 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
* 09:46 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
* 09:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
* 09:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
* 09:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 09:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 09:34 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 09:33 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
* 09:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
* 09:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 09:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
* 09:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 09:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
* 09:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
* 09:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 09:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 09:28 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
* 09:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 09:27 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
* 09:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 09:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 09:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 09:24 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
* 09:22 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ncredir4004.ulsfo.wmnet
* 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4004.ulsfo.wmnet with OS bookworm
* 09:15 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:15 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:14 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 09:10 javiermonton@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249217{{!}}stream: mediawiki.page_html_content_change (T419258)]] (duration: 08m 28s)
* 09:07 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2096.codfw.wmnet with OS bullseye
* 09:06 javiermonton@deploy2002: javiermonton: Continuing with sync
* 09:03 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2095.codfw.wmnet with OS bullseye
* 09:03 javiermonton@deploy2002: javiermonton: Backport for [[gerrit:1249217{{!}}stream: mediawiki.page_html_content_change (T419258)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
* 09:01 javiermonton@deploy2002: Started scap sync-world: Backport for [[gerrit:1249217{{!}}stream: mediawiki.page_html_content_change (T419258)]]
* 08:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1015.eqiad.wmnet with reason: host reimage
* 08:58 trueg@deploy2002: helmfile [staging] DONE helmfile.d/services/SERVICE_NAME: apply
* 08:58 trueg@deploy2002: helmfile [staging] START helmfile.d/services/SERVICE_NAME: apply
* 08:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
* 08:55 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2239.codfw.wmnet with reason: mysql upgrade / restart
* 08:54 moritzm: installing imagemagick security updates
* 08:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1015.eqiad.wmnet with reason: host reimage
* 08:41 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1015.eqiad.wmnet with OS trixie
* 08:40 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1014.eqiad.wmnet with OS trixie
* 08:40 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:39 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:35 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4004.ulsfo.wmnet with OS bookworm
* 08:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
* 08:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir4004.ulsfo.wmnet on all recursors
* 08:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir4004.ulsfo.wmnet on all recursors
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
* 08:25 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
* 08:23 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1014.eqiad.wmnet with reason: host reimage
* 08:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 08:21 Msz2001: UTC morning backport window finished
* 08:21 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 08:21 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir4004.ulsfo.wmnet
* 08:21 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250426{{!}}Drop underscore from titles in wgOATH2FARequiredGroupRemovalPages]] (duration: 10m 46s)
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ncredir4003.ulsfo.wmnet
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4003.ulsfo.wmnet with OS bookworm
* 08:17 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1014.eqiad.wmnet with reason: host reimage
* 08:15 mszwarc@deploy2002: mszwarc: Continuing with sync
* 08:14 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1250426{{!}}Drop underscore from titles in wgOATH2FARequiredGroupRemovalPages]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:10 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1250426{{!}}Drop underscore from titles in wgOATH2FARequiredGroupRemovalPages]]
* 08:09 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249921{{!}}Display list of 2FA-req. groups on AccountSecurity for 2FA-less users (T419422)]], [[gerrit:1250066{{!}}Send2FAWarningNotifications: Support reading users from file (T419111)]] (duration: 33m 07s)
* 08:05 moritzm: installing mariadb bugfix updates from Bookworm point release (tools and libraries as packaged in Debian, unrelated to the wmf-mariadb packages)
* 08:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1014.eqiad.wmnet with OS trixie
* 08:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
* 07:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
* 07:57 mszwarc@deploy2002: mszwarc: Continuing with sync
* 07:56 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1249921{{!}}Display list of 2FA-req. groups on AccountSecurity for 2FA-less users (T419422)]], [[gerrit:1250066{{!}}Send2FAWarningNotifications: Support reading users from file (T419111)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:44 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti1049.eqiad.wmnet
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
* 07:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
* 07:38 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4003.ulsfo.wmnet with OS bookworm
* 07:36 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1249921{{!}}Display list of 2FA-req. groups on AccountSecurity for 2FA-less users (T419422)]], [[gerrit:1250066{{!}}Send2FAWarningNotifications: Support reading users from file (T419111)]]
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
* 07:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir4003.ulsfo.wmnet on all recursors
* 07:34 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir4003.ulsfo.wmnet on all recursors
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
* 07:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
* 07:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir4003.ulsfo.wmnet
* 07:22 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247639{{!}}Enable rr-ml AutoModerator CC Set AutoModeratorMultiLingualRevertRisk with available wikis (T400727)]] (duration: 12m 24s)
* 07:18 kgraessle@deploy2002: kgraessle: Continuing with sync
* 07:12 kgraessle@deploy2002: kgraessle: Backport for [[gerrit:1247639{{!}}Enable rr-ml AutoModerator CC Set AutoModeratorMultiLingualRevertRisk with available wikis (T400727)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:09 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1247639{{!}}Enable rr-ml AutoModerator CC Set AutoModeratorMultiLingualRevertRisk with available wikis (T400727)]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 59s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 00:33 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250117{{!}}Stop setting $wgImageLinksSchemaMigrationStage (T299953)]] (duration: 09m 38s)
* 00:29 zabe@deploy2002: zabe: Continuing with sync
* 00:26 zabe@deploy2002: zabe: Backport for [[gerrit:1250117{{!}}Stop setting $wgImageLinksSchemaMigrationStage (T299953)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:24 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1250117{{!}}Stop setting $wgImageLinksSchemaMigrationStage (T299953)]]
* 00:03 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
* 00:03 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint1003.wikimedia.org with OS trixie
* 00:03 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:03 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
== 2026-03-10 ==
* 23:58 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
* 23:53 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
* 23:49 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
* 23:44 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint1003.wikimedia.org with reason: host reimage
* 23:40 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on contint1003.wikimedia.org with reason: host reimage
* 23:31 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2096.codfw.wmnet with OS bullseye
* 23:31 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2095.codfw.wmnet with OS bullseye
* 23:26 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2095.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:24 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2096.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:22 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 23:22 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:12 vriley@cumin1003: START - Cookbook sre.hosts.provision for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:11 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2096.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:05 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 23:05 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:59 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2095.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:39 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:38 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:51 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp7012.magru.wmnet with OS trixie
* 21:48 Dreamy_Jazz: Evening UTC backport window done
* 21:42 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7006.magru.wmnet [reason: trixie reimaging]
* 21:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7006.magru.wmnet with OS trixie
* 21:25 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1235552{{!}}Migrate EmailAuth, step 2 (T404334)]] (duration: 25m 34s)
* 21:24 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7007.magru.wmnet [reason: trixie reimaging]
* 21:22 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7007.magru.wmnet with OS trixie
* 21:21 tgr@deploy2002: tgr: Continuing with sync
* 21:13 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
* 21:09 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
* 21:02 tgr@deploy2002: tgr: Backport for [[gerrit:1235552{{!}}Migrate EmailAuth, step 2 (T404334)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:00 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1235552{{!}}Migrate EmailAuth, step 2 (T404334)]]
* 20:59 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7012.magru.wmnet with OS trixie
* 20:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7007.magru.wmnet with reason: host reimage
* 20:50 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249983{{!}}Deploy participant recruitment survey on ptwiki and trwiki (T419275)]], [[gerrit:1238733{{!}}wikifunctions: Drop temporary WikifunctionsEnableHTMLOutput flag (T397402)]], [[gerrit:1238734{{!}}wikifunctions: Drop temporary WikifunctionsEnableWikidataInputTypes flag (T397403)]], [[gerrit:1249393{{!}}build: Upgrade mediawiki-phan-config from 0.18.0 to 0.2
* 20:48 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7007.magru.wmnet with reason: host reimage
* 20:46 jforrester@deploy2002: dani, jforrester: Continuing with sync
* 20:45 jforrester@deploy2002: dani, jforrester: Backport for [[gerrit:1249983{{!}}Deploy participant recruitment survey on ptwiki and trwiki (T419275)]], [[gerrit:1238733{{!}}wikifunctions: Drop temporary WikifunctionsEnableHTMLOutput flag (T397402)]], [[gerrit:1238734{{!}}wikifunctions: Drop temporary WikifunctionsEnableWikidataInputTypes flag (T397403)]], [[gerrit:1249393{{!}}build: Upgrade mediawiki-phan-config from 0.18.0 to 0.20.0 (T41
* 20:43 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7006.magru.wmnet with OS trixie
* 20:43 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1249983{{!}}Deploy participant recruitment survey on ptwiki and trwiki (T419275)]], [[gerrit:1238733{{!}}wikifunctions: Drop temporary WikifunctionsEnableHTMLOutput flag (T397402)]], [[gerrit:1238734{{!}}wikifunctions: Drop temporary WikifunctionsEnableWikidataInputTypes flag (T397403)]], [[gerrit:1249393{{!}}build: Upgrade mediawiki-phan-config from 0.18.0 to 0.20
* 20:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7006.magru.wmnet with OS trixie
* 20:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - cdobbins@cumin2002"
* 20:38 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1240012{{!}}Enable personal main menu to all users in Minerva Neue skin (T413912)]], [[gerrit:1250007{{!}}Enables legacy processing in ParserOutputPostCacheTransform when cached (T372592)]], [[gerrit:1250015{{!}}Parser: Raise minimum TTL from 30 min to 'next midnight' in miser mode (T416616 T416540 T419439)]] (duration: 12m 58s)
* 20:36 cdobbins@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - cdobbins@cumin2002"
* 20:34 jforrester@deploy2002: jforrester, cscott, bwang: Continuing with sync
* 20:27 jforrester@deploy2002: jforrester, cscott, bwang: Backport for [[gerrit:1240012{{!}}Enable personal main menu to all users in Minerva Neue skin (T413912)]], [[gerrit:1250007{{!}}Enables legacy processing in ParserOutputPostCacheTransform when cached (T372592)]], [[gerrit:1250015{{!}}Parser: Raise minimum TTL from 30 min to 'next midnight' in miser mode (T416616 T416540 T419439)]] synced to the testservers (see https://wikitech.wi
* 20:25 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1240012{{!}}Enable personal main menu to all users in Minerva Neue skin (T413912)]], [[gerrit:1250007{{!}}Enables legacy processing in ParserOutputPostCacheTransform when cached (T372592)]], [[gerrit:1250015{{!}}Parser: Raise minimum TTL from 30 min to 'next midnight' in miser mode (T416616 T416540 T419439)]]
* 20:25 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7007.magru.wmnet with OS trixie
* 20:24 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7007.magru.wmnet [reason: trixie reimaging]
* 20:24 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7005.magru.wmnet [reason: trixie reimaging]
* 20:16 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7005.magru.wmnet with OS trixie
* 20:10 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
* 20:03 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7013.*
* 20:03 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
* 19:50 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7013.magru.wmnet with OS trixie
* 19:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7005.magru.wmnet with reason: host reimage
* 19:42 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7005.magru.wmnet with reason: host reimage
* 19:40 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7006.magru.wmnet with OS trixie
* 19:40 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7006.magru.wmnet [reason: trixie reimaging]
* 19:39 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7004.magru.wmnet [reason: trixie reimaging]
* 19:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7013.magru.wmnet with reason: host reimage
* 19:19 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7005.magru.wmnet with OS trixie
* 19:19 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7004.magru.wmnet with OS trixie
* 19:19 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7005.magru.wmnet [reason: trixie reimaging]
* 19:18 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7013.magru.wmnet with reason: host reimage
* 19:17 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7003.magru.wmnet [reason: trixie reimaging]
* 19:16 brennen@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.19 refs [[phab:T413810|T413810]]
* 19:16 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7003.magru.wmnet with OS trixie
* 19:09 brennen: 1.46.0-wmf.19 train status: blockers believed resolved, rolling to group0
* 19:07 brennen@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249937{{!}}Re-add correct namespace for translatable pages (T419294)]] (duration: 12m 30s)
* 19:06 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
* 19:01 brennen@deploy2002: abi, brennen: Continuing with sync
* 18:58 brennen@deploy2002: abi, brennen: Backport for [[gerrit:1249937{{!}}Re-add correct namespace for translatable pages (T419294)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:55 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7013.magru.wmnet with OS trixie
* 18:54 brennen@deploy2002: Started scap sync-world: Backport for [[gerrit:1249937{{!}}Re-add correct namespace for translatable pages (T419294)]]
* 18:52 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7004.magru.wmnet with reason: host reimage
* 18:52 brennen@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.19 refs [[phab:T413810|T413810]] (duration: 38m 34s)
* 18:49 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7004.magru.wmnet with reason: host reimage
* 18:47 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7003.magru.wmnet with reason: host reimage
* 18:44 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7003.magru.wmnet with reason: host reimage
* 18:41 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7015.*
* 18:27 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7015.magru.wmnet with OS trixie
* 18:23 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7004.magru.wmnet with OS trixie
* 18:21 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7004.magru.wmnet [reason: trixie reimaging]
* 18:16 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7003.magru.wmnet with OS trixie
* 18:13 brennen@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.19 refs [[phab:T413810|T413810]]
* 18:13 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7003.magru.wmnet [reason: trixie reimaging]
* 18:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7003.magru.wmnet [reason: trixie reimaging]
* 18:00 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 17:59 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 17:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7015.magru.wmnet with reason: host reimage
* 17:59 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 17:59 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 17:56 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7015.magru.wmnet with reason: host reimage
* 17:54 hashar@deploy2002: Finished deploy [integration/docroot@f544f49]: Catch up with composer/npm dev dependencies. Noop for production (duration: 00m 11s)
* 17:54 hashar@deploy2002: Started deploy [integration/docroot@f544f49]: Catch up with composer/npm dev dependencies. Noop for production
* 17:43 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 17:43 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 17:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:32 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:31 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:30 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7015.magru.wmnet with OS trixie
* 17:30 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:29 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:28 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:28 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 17:28 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:26 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:23 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
* 17:22 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 17:21 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 17:12 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:12 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:11 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:11 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:09 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:09 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:01 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 16:40 andrew@dns1004: END - running authdns-update
* 16:38 andrew@dns1004: START - running authdns-update
* 16:25 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249993{{!}}Revert "CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled"]] (duration: 07m 45s)
* 16:21 reedy@deploy2002: reedy: Continuing with sync
* 16:19 reedy@deploy2002: reedy: Backport for [[gerrit:1249993{{!}}Revert "CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:17 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1249993{{!}}Revert "CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled"]]
* 15:59 jynus@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 15:59 taavi: update cr firewall policy for codfw1dev ldap tree https://gerrit.wikimedia.org/r/1249985
* 15:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-fr-tech: apply
* 15:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-fr-tech: apply
* 15:55 jynus@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 15:48 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:34 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:28 brouberol@dns1004: END - running authdns-update
* 15:27 brouberol@dns1004: START - running authdns-update
* 15:10 swfrench@cumin2002: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002"
* 15:10 swfrench@cumin2002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002
* 15:09 swfrench@cumin2002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002
* 15:09 swfrench@cumin2002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002"
* 15:05 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:59 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:58 sukhe: sudo cumin -b1 -s15 "C:bird" "run-puppet-agent --enable 'merging CR 1238007; add function return type'"
* 14:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:58 sukhe: sudo cumin -b1 -s15 "C:bird" "run-puppet-agent 'merging CR 1238007; add function return type'"
* 14:51 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:42 sukhe: sudo cumin "C:bird" "disable-puppet 'merging CR 1238007; add function return type'"
* 14:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host ml-serve1014
* 14:39 elukey@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:36 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host ml-serve1014
* 14:36 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.powercycle (exit_code=99) for host ml-serve1014
* 14:36 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host ml-serve1014
* 14:12 otto@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249367{{!}}stream: mediawiki.page_edit_type_simple.dev0 (T351225)]] (duration: 11m 05s)
* 14:08 otto@deploy2002: akhatun, otto: Continuing with sync
* 14:02 otto@deploy2002: akhatun, otto: Backport for [[gerrit:1249367{{!}}stream: mediawiki.page_edit_type_simple.dev0 (T351225)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:01 otto@deploy2002: Started scap sync-world: Backport for [[gerrit:1249367{{!}}stream: mediawiki.page_edit_type_simple.dev0 (T351225)]]
* 13:49 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
* 13:43 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:28 vgutierrez: testing acme-chief 0.39 in acmechief-test2001 - [[phab:T419352|T419352]]
* 13:27 vgutierrez: upload acme-chief 0.39 to bookworm-wikimedia (apt.wm.o) - [[phab:T419352|T419352]]
* 13:16 jiji@cumin1003: END (FAIL) - Cookbook sre.memcached.roll-reboot-restart (exit_code=1) rolling restart_daemons on A:memcached-canary
* 13:16 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling restart_daemons on A:memcached-canary
* 13:12 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249903{{!}}Require 2FA from 6 other user groups ($wgRestrictedGroups) (T418580)]], [[gerrit:1249035{{!}}kaiwiki: add logo, stiename, projectnamespace and timezone (T414237)]] (duration: 08m 45s)
* 13:08 mszwarc@deploy2002: mszwarc, anzx: Continuing with sync
* 13:05 mszwarc@deploy2002: mszwarc, anzx: Backport for [[gerrit:1249903{{!}}Require 2FA from 6 other user groups ($wgRestrictedGroups) (T418580)]], [[gerrit:1249035{{!}}kaiwiki: add logo, stiename, projectnamespace and timezone (T414237)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1249903{{!}}Require 2FA from 6 other user groups ($wgRestrictedGroups) (T418580)]], [[gerrit:1249035{{!}}kaiwiki: add logo, stiename, projectnamespace and timezone (T414237)]]
* 13:03 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad
* 12:57 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1015.eqiad.wmnet with OS bookworm
* 12:56 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 12:51 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1014.eqiad.wmnet with OS bookworm
* 12:50 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ml-serve1014
* 12:50 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ml-serve1014
* 12:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:49 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:49 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:48 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:47 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:45 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:44 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:42 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:42 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling restart_daemons on A:memcached-canary
* 12:42 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling restart_daemons on A:memcached-canary
* 12:31 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 12:10 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 12:10 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 11:59 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 11:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe2024.codfw.wmnet with OS bullseye
* 11:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1003"
* 11:17 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1003"
* 11:15 Emperor: rebalance codfw swift rings [[phab:T354872|T354872]]
* 10:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-fe2024.codfw.wmnet with reason: host reimage
* 10:47 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-fe2024.codfw.wmnet with reason: host reimage
* 10:31 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host ms-fe2024.codfw.wmnet with OS bullseye
* 10:30 ayounsi@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-fe2024.codfw.wmnet with OS bullseye
* 10:20 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host ms-fe2024.codfw.wmnet with OS bullseye
* 10:17 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device cr2-eqdfw
* 09:31 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device cr2-eqdfw
* 09:22 derick@deploy2002: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=loginwiki --logwiki=metawiki TMPRI1975 FondueFanatic # [[phab:T419499|T419499]]
* 09:00 arnaudb@dns1005: END - running authdns-update
* 09:00 godog: restore all host interfaces - [[phab:T417393|T417393]]
* 08:58 arnaudb@dns1005: START - running authdns-update
* 08:30 godog: disabled interface for cloudcephmon1004 - [[phab:T417393|T417393]]
* 08:22 godog: disabled interfaces for cloudcephosd1021 cloudcephosd1042 cloudcephosd1043 cloudcephosd1018 cloudcephosd1022 - [[phab:T417393|T417393]]
* 08:18 godog: disabled interfaces for cloudcephosd1016 cloudcephosd1017 cloudcephosd1016 cloudcephosd1018 cloudcephosd1017 cloudcephosd1035 - [[phab:T417393|T417393]]
* 08:05 godog: start disabling cloudcephosd interfaces - [[phab:T417393|T417393]]
* 07:49 godog: prep cloudsw reboot tests 'ceph osd set noout' - [[phab:T417393|T417393]]
* 07:41 filippo@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 19 hosts with reason: switch down tests
* 06:14 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs2009.codfw.wmnet with OS bookworm
* 04:09 pt1979@cumin2002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device asw1-23-ulsfo
* 04:08 pt1979@cumin2002: START - Cookbook sre.network.tls for network device asw1-23-ulsfo
* 04:01 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.16 (duration: 01m 48s)
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 10s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:37 ryankemper: [WDQS] [[phab:T410573|T410573]] repooled wdqs1011.eqiad.wmnet - erroneously depooled since `2025-11-19` by failed `sre.wdqs.reboot` cookbook
* 00:42 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 00:39 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
== 2026-03-09 ==
* 22:51 rzl: root@apt1002:~# reprepro --noskipold --restrict vopsbot update bookworm-wikimedia
* 22:34 bking@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM dse-k8s-ctrl1001.eqiad.wmnet
* 22:32 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1001.eqiad.wmnet
* 22:30 bking@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:29 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:28 bking@cumin2002: END (FAIL) - Cookbook sre.ganeti.reboot-vm (exit_code=99) for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:28 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:28 bking@cumin2002: END (FAIL) - Cookbook sre.ganeti.reboot-vm (exit_code=99) for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:28 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:03 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw2004-dev.codfw.wmnet with OS trixie
* 22:02 alexsanford: Redeployed security fix for [[phab:T419186|T419186]]
* 21:44 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
* 21:40 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
* 21:37 cdobbins@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7002.magru.wmnet
* 21:34 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7002.magru.wmnet with OS trixie
* 21:29 alexsanford: Deployed security fix for [[phab:T419186|T419186]]
* 21:22 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudgw2004-dev.codfw.wmnet with OS trixie
* 21:21 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudgw2004-dev.codfw.wmnet with OS trixie
* 21:17 dani@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249370{{!}}Pre-deploy participant recruitment survey on ptwiki and trwiki (T419275)]] (duration: 08m 15s)
* 21:13 dani@deploy2002: dani: Continuing with sync
* 21:11 dani@deploy2002: dani: Backport for [[gerrit:1249370{{!}}Pre-deploy participant recruitment survey on ptwiki and trwiki (T419275)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:09 dani@deploy2002: Started scap sync-world: Backport for [[gerrit:1249370{{!}}Pre-deploy participant recruitment survey on ptwiki and trwiki (T419275)]]
* 21:08 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
* 21:05 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cp7002.magru.wmnet with reason: host reimage
* 21:02 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7002.magru.wmnet with reason: host reimage
* 21:01 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
* 21:01 tgr_: removed private code for [[phab:T397244|T397244]]
* 21:01 ryankemper: [WDQS] Alright, these are re-entering a failed state quickly enough that we will need to identify the offender if we want to restore proper service. We could put in a temporary hack to restart every few minutes so we at least maintain some uptime, but the root cause is the usual 'we need a requestctl rule to block whoever's killing us' scenario
* 21:00 cdobbins@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7001.magru.wmnet [reason: Trixie reimaging]
* 20:57 ryankemper: [WDQS] Auto-remediation would have eventually restarted these, but some of them were staying below our current threshold of `threads > 1200`. May want to lower the threshold, or look at an additional metric type in the future
* 20:56 ryankemper: [WDQS] `ryankemper@cumin2002:~$ sudo -E cumin 'A:wdqs-main AND P<nowiki>{</nowiki>wdqs1*<nowiki>}</nowiki>' 'systemctl restart wdqs-blazegraph'`
* 20:54 ryankemper: [WDQS] `ryankemper@cumin2002:~$ sudo -E cumin 'A:wdqs-main AND P<nowiki>{</nowiki>wdqs2*<nowiki>}</nowiki>' 'systemctl restart wdqs-blazegraph'`
* 20:44 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudgw2004-dev.codfw.wmnet with OS trixie
* 20:43 tgr@deploy2002: Unlocked for deployment [MediaWiki]: working on private change (duration: 10m 10s)
* 20:36 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7002.magru.wmnet with OS trixie
* 20:33 tgr@deploy2002: Locking from deployment [MediaWiki]: working on private change
* 20:31 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247119{{!}}Enable parser survey for opted-out users on German/French/Polish wikis (T414852)]], [[gerrit:1249316{{!}}lift IP cap for womens month editathon (T419109)]] (duration: 13m 36s)
* 20:27 tgr@deploy2002: cscott, tgr, anzx: Continuing with sync
* 20:19 tgr@deploy2002: cscott, tgr, anzx: Backport for [[gerrit:1247119{{!}}Enable parser survey for opted-out users on German/French/Polish wikis (T414852)]], [[gerrit:1249316{{!}}lift IP cap for womens month editathon (T419109)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1247119{{!}}Enable parser survey for opted-out users on German/French/Polish wikis (T414852)]], [[gerrit:1249316{{!}}lift IP cap for womens month editathon (T419109)]]
* 20:13 aaron@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249363{{!}}Remove redundant math spec file from wwwportal (T418188)]] (duration: 06m 56s)
* 20:09 aaron@deploy2002: aaron: Continuing with sync
* 20:08 aaron@deploy2002: aaron: Backport for [[gerrit:1249363{{!}}Remove redundant math spec file from wwwportal (T418188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:06 aaron@deploy2002: Started scap sync-world: Backport for [[gerrit:1249363{{!}}Remove redundant math spec file from wwwportal (T418188)]]
* 20:01 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7016.*
* 19:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7001.magru.wmnet with OS trixie
* 19:51 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7016.magru.wmnet with OS trixie
* 19:49 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248911{{!}}Stop writing to il_to on commonswiki (T415787)]] (duration: 06m 04s)
* 19:45 zabe@deploy2002: zabe: Continuing with sync
* 19:44 zabe@deploy2002: zabe: Backport for [[gerrit:1248911{{!}}Stop writing to il_to on commonswiki (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:43 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248911{{!}}Stop writing to il_to on commonswiki (T415787)]]
* 19:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 19:28 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 19:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7001.magru.wmnet with reason: host reimage
* 19:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7016.magru.wmnet with reason: host reimage
* 19:23 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7001.magru.wmnet with reason: host reimage
* 19:19 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7016.magru.wmnet with reason: host reimage
* 19:15 cwhite@deploy2002: Finished deploy [performance/arc-lamp@aa8da8b]: {{Gerrit|Ie7e0355f89294a2927f9dbc28afec3a62d1752de}} (duration: 00m 08s)
* 19:15 cwhite@deploy2002: Started deploy [performance/arc-lamp@aa8da8b]: {{Gerrit|Ie7e0355f89294a2927f9dbc28afec3a62d1752de}}
* 19:14 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 19:14 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 19:05 herron@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249365{{!}}udp2log: switch to new hosts]] (duration: 09m 38s)
* 19:03 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:03 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:01 herron@deploy2002: herron: Continuing with sync
* 19:00 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:00 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 18:59 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 18:59 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 18:59 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 18:59 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 18:57 herron@deploy2002: herron: Backport for [[gerrit:1249365{{!}}udp2log: switch to new hosts]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:57 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7001.magru.wmnet with OS trixie
* 18:55 herron@deploy2002: Started scap sync-world: Backport for [[gerrit:1249365{{!}}udp2log: switch to new hosts]]
* 18:55 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7016.magru.wmnet with OS trixie
* 18:50 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 18:49 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 18:44 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 18:44 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 18:34 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 18:34 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 18:33 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 18:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 18:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 18:32 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 18:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 18:29 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 18:27 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 18:27 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 18:23 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 18:23 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
* 18:23 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 18:16 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 18:16 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 18:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1001.eqiad.wmnet
* 18:05 herron@deploy2002: Sync cancelled.
* 18:04 herron@deploy2002: herron: Backport for [[gerrit:1249361{{!}}Revert "udp2log: switch to new hosts"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:02 herron@deploy2002: Started scap sync-world: Backport for [[gerrit:1249361{{!}}Revert "udp2log: switch to new hosts"]]
* 18:01 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1001.eqiad.wmnet
* 17:54 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:47 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:42 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:42 herron@deploy2002: Sync cancelled.
* 17:40 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:39 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:38 mutante: contint1003 - unable to get uptime Caused by: Cumin execution failed (exit_code=2) [101/240] - attempted manual powercycle - Initializing Firmware Interfaces... blank screen [[phab:T418544|T418544]]
* 17:34 mutante: contint1003.mgmt - racadm serveraction powercycle [[phab:T418544|T418544]] - not reacting
* 17:25 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:25 herron@deploy2002: herron: Backport for [[gerrit:1249332{{!}}udp2log: switch to new hosts (T417002)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:23 herron@deploy2002: Started scap sync-world: Backport for [[gerrit:1249332{{!}}udp2log: switch to new hosts (T417002)]]
* 17:19 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host netflow4003.ulsfo.wmnet
* 17:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host netflow4003.ulsfo.wmnet with OS bookworm
* 17:13 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:08 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet
* 17:03 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 17:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet
* 17:00 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis kaiwiki in section s5
* 16:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on netflow4003.ulsfo.wmnet with reason: host reimage
* 16:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on netflow4003.ulsfo.wmnet with reason: host reimage
* 16:37 moritzm: installing gnupg security updates
* 16:31 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host netflow4003.ulsfo.wmnet with OS bookworm
* 16:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
* 16:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
* 16:30 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netflow4003.ulsfo.wmnet on all recursors
* 16:30 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netflow4003.ulsfo.wmnet on all recursors
* 16:30 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
* 16:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
* 16:26 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:26 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host netflow4003.ulsfo.wmnet
* 16:26 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 15:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on prometheus4003.ulsfo.wmnet with reason: host reimage
* 15:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on prometheus4003.ulsfo.wmnet with reason: host reimage
* 15:44 vgutierrez: vgutierrez@acmechief-test2001:~$ sudo -i systemctl disable reload-acme-chief-backend.timer - [[phab:T419352|T419352]]
* 15:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 15:37 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 15:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 15:30 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 15:26 elukey@cumin1003: END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 15:24 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 15:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2004.codfw.wmnet
* 15:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people2004.codfw.wmnet
* 15:12 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 15:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo1001.eqiad.wmnet
* 15:08 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 15:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo1001.eqiad.wmnet
* 14:50 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2009.codfw.wmnet with OS bookworm
* 14:49 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs2009.codfw.wmnet with OS bullseye
* 14:45 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 14:35 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249291{{!}}Hide 2fa-warning Echo category from preferences (T419111)]] (duration: 06m 07s)
* 14:35 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis kaiwiki in section s5
* 14:34 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.sanitize-wiki (exit_code=99) Managing sanitization for wikis urwikisource in section s5
* 14:31 mszwarc@deploy2002: mszwarc: Continuing with sync
* 14:31 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1249291{{!}}Hide 2fa-warning Echo category from preferences (T419111)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:30 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis urwikisource in section s5
* 14:29 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1249291{{!}}Hide 2fa-warning Echo category from preferences (T419111)]]
* 14:25 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Checking sanitization for wikis urwikisource in section s5
* 14:22 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Checking sanitization for wikis urwikisource in section s5
* 14:20 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 14:15 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 14:15 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249243{{!}}JS SDK: Add getExperimentByPrefix() (T419191)]], [[gerrit:1249242{{!}}ext.wikimediaEvents: pageVisit -> loggedOutReaderRetention (T419191)]] (duration: 09m 39s)
* 14:11 phuedx@deploy2002: phuedx: Continuing with sync
* 14:07 phuedx@deploy2002: phuedx: Backport for [[gerrit:1249243{{!}}JS SDK: Add getExperimentByPrefix() (T419191)]], [[gerrit:1249242{{!}}ext.wikimediaEvents: pageVisit -> loggedOutReaderRetention (T419191)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:05 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1249243{{!}}JS SDK: Add getExperimentByPrefix() (T419191)]], [[gerrit:1249242{{!}}ext.wikimediaEvents: pageVisit -> loggedOutReaderRetention (T419191)]]
* 14:03 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet
* 13:54 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2009.codfw.wmnet with OS bullseye
* 13:50 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249262{{!}}Disable MetricsPlatform extension (T416865)]] (duration: 08m 02s)
* 13:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet
* 13:46 phuedx@deploy2002: phuedx, sfaci: Continuing with sync
* 13:44 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 13:44 phuedx@deploy2002: phuedx, sfaci: Backport for [[gerrit:1249262{{!}}Disable MetricsPlatform extension (T416865)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:42 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1249262{{!}}Disable MetricsPlatform extension (T416865)]]
* 13:39 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248075{{!}}Confirmemail: Log delay between email sent and confirmation (T415902)]], [[gerrit:1247651{{!}}Enable confirmemail logstash channel (T415902)]] (duration: 11m 16s)
* 13:35 phuedx@deploy2002: mmartorana, phuedx: Continuing with sync
* 13:30 phuedx@deploy2002: mmartorana, phuedx: Backport for [[gerrit:1248075{{!}}Confirmemail: Log delay between email sent and confirmation (T415902)]], [[gerrit:1247651{{!}}Enable confirmemail logstash channel (T415902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1248075{{!}}Confirmemail: Log delay between email sent and confirmation (T415902)]], [[gerrit:1247651{{!}}Enable confirmemail logstash channel (T415902)]]
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 13:04 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 12:55 moritzm: installing Kerberos security updates
* 12:29 moritzm: installing python3.9 security updates
* 12:11 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 12:00 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1239026{{!}}Revert "CommonSettings: Temporarily set $wgOATHUserHandlesTable = true" (T416544)]], [[gerrit:1249253{{!}}CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled]] (duration: 06m 13s)
* 11:56 reedy@deploy2002: reedy: Continuing with sync
* 11:56 reedy@deploy2002: reedy: Backport for [[gerrit:1239026{{!}}Revert "CommonSettings: Temporarily set $wgOATHUserHandlesTable = true" (T416544)]], [[gerrit:1249253{{!}}CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:54 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1239026{{!}}Revert "CommonSettings: Temporarily set $wgOATHUserHandlesTable = true" (T416544)]], [[gerrit:1249253{{!}}CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled]]
* 11:44 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249245{{!}}Hooks: Really only add global logging context for pageviews]] (duration: 12m 02s)
* 11:38 phuedx@deploy2002: phuedx: Continuing with sync
* 11:34 phuedx@deploy2002: phuedx: Backport for [[gerrit:1249245{{!}}Hooks: Really only add global logging context for pageviews]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:32 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1249245{{!}}Hooks: Really only add global logging context for pageviews]]
* 11:29 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 11:29 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 11:03 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 11:03 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:57 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 10:56 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 10:50 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:49 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:45 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus4003.ulsfo.wmnet
* 10:45 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 10:45 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
* 10:44 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus4003.ulsfo.wmnet on all recursors
* 10:43 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache prometheus4003.ulsfo.wmnet on all recursors
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
* 10:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
* 10:40 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:39 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:39 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:39 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host prometheus4003.ulsfo.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:33 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:17 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 10:12 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:51 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 09:46 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 09:43 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:40 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus4003.ulsfo.wmnet
* 09:40 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 09:35 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:35 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host prometheus4003.ulsfo.wmnet
* 09:31 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host frdb1008
* 09:31 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host frdb1008
* 09:29 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 09:05 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 08:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
* 08:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 08:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 08:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 08:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 08:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 08:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 08:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
* 08:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
* 08:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 08:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 08:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
* 08:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:25 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
* 08:24 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 08:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 08:21 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 08:16 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
* 08:16 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
* 08:07 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo and group 1
* 08:07 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo and group 1
* 07:37 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248806{{!}}Add a script to send mandatory 2FA Echo notification (T419111)]], [[gerrit:1248821{{!}}Set $wgOATH2FARequiredGroupRemovalPages for interface-admins (T417880)]] (duration: 34m 41s)
* 07:23 mszwarc@deploy2002: mszwarc: Continuing with sync
* 07:22 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1248806{{!}}Add a script to send mandatory 2FA Echo notification (T419111)]], [[gerrit:1248821{{!}}Set $wgOATH2FARequiredGroupRemovalPages for interface-admins (T417880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:02 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1248806{{!}}Add a script to send mandatory 2FA Echo notification (T419111)]], [[gerrit:1248821{{!}}Set $wgOATH2FARequiredGroupRemovalPages for interface-admins (T417880)]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 58s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-08 ==
* 20:28 vgutierrez@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on acmechief-test2001.codfw.wmnet with reason: GTS issues
* 02:01 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 00m 59s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-07 ==
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 23s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:20 krinkle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248952{{!}}CSP: restore toolforge/wmcs entry in false positive list]] (duration: 10m 46s)
* 01:16 krinkle@deploy2002: krinkle: Continuing with sync
* 01:11 krinkle@deploy2002: krinkle: Backport for [[gerrit:1248952{{!}}CSP: restore toolforge/wmcs entry in false positive list]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:09 krinkle@deploy2002: Started scap sync-world: Backport for [[gerrit:1248952{{!}}CSP: restore toolforge/wmcs entry in false positive list]]
* 00:22 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2043.*
* 00:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp2043.codfw.wmnet
* 00:05 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp2043.codfw.wmnet
== 2026-03-06 ==
* 23:29 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs2009.codfw.wmnet with OS bullseye
* 23:13 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 23:07 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 22:46 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs2009
* 22:46 ryankemper@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2009
* 22:46 ryankemper@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2009
* 22:46 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs2009.codfw.wmnet 141.0.192.10.in-addr.arpa 1.4.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:46 ryankemper@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs2009.codfw.wmnet 141.0.192.10.in-addr.arpa 1.4.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:45 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:45 ryankemper@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs2009 - ryankemper@cumin2002"
* 22:45 ryankemper@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs2009 - ryankemper@cumin2002"
* 22:41 ryankemper@cumin2002: START - Cookbook sre.dns.netbox
* 22:40 ryankemper@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs2009
* 22:39 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2009.codfw.wmnet with OS bullseye
* 19:48 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:47 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:47 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:46 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:36 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host wdqs2009.codfw.wmnet
* 19:23 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2009.codfw.wmnet
* 19:17 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on wdqs2009.codfw.wmnet with reason: NFS might be hung, about to reboot
* 18:56 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp2043.codfw.wmnet with reason: troubleshooting for network drops
* 18:44 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp2043.*
* 18:29 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts an-backup-datanode1033.eqiad.wmnet
* 18:29 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:29 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-backup-datanode1033.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1003"
* 18:28 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-backup-datanode1033.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1003"
* 17:59 ebernhardson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248858{{!}}cirrus: Use https for semanticsearch-test cluster]] (duration: 11m 20s)
* 17:53 ebernhardson@deploy2002: ebernhardson: Continuing with sync
* 17:52 ebernhardson@deploy2002: ebernhardson: Backport for [[gerrit:1248858{{!}}cirrus: Use https for semanticsearch-test cluster]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:51 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:51 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:47 ebernhardson@deploy2002: Started scap sync-world: Backport for [[gerrit:1248858{{!}}cirrus: Use https for semanticsearch-test cluster]]
* 17:42 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:42 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:40 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:40 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:11 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:11 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:10 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 17:05 hashar@deploy2002: Finished deploy [gerrit/gerrit@b8183ba]: wm-checks-api: add tooltip to the CheckRun Run action (duration: 00m 13s)
* 17:05 hashar@deploy2002: Started deploy [gerrit/gerrit@b8183ba]: wm-checks-api: add tooltip to the CheckRun Run action
* 17:04 btullis@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-backup-datanode1033.eqiad.wmnet
* 16:48 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 16:48 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 16:23 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 16:23 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 15:57 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:57 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:56 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:56 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:52 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2354-2356].codfw.wmnet
* 15:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2354-2356].codfw.wmnet
* 15:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2356.codfw.wmnet with OS trixie
* 15:46 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2355.codfw.wmnet with OS trixie
* 15:42 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2354.codfw.wmnet with OS trixie
* 15:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2356.codfw.wmnet with reason: host reimage
* 15:31 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
* 15:30 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
* 15:28 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 15:28 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
* 15:28 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
* 15:26 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
* 15:26 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
* 15:26 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2355.codfw.wmnet with reason: host reimage
* 15:24 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 15:23 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 15:23 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2354.codfw.wmnet with reason: host reimage
* 15:19 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 15:19 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 15:17 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 15:17 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 15:17 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2356.codfw.wmnet with reason: host reimage
* 15:16 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2355.codfw.wmnet with reason: host reimage
* 15:16 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2354.codfw.wmnet with reason: host reimage
* 15:15 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 15:10 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 15:09 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 15:08 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 15:08 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 15:06 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 15:05 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 15:05 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 15:05 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 15:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2356.codfw.wmnet with OS trixie
* 15:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2355.codfw.wmnet with OS trixie
* 15:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2354.codfw.wmnet with OS trixie
* 15:02 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
* 15:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2348-2353].codfw.wmnet
* 15:02 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
* 15:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2348-2353].codfw.wmnet
* 14:59 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2353.codfw.wmnet with OS trixie
* 14:57 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2349.codfw.wmnet with OS trixie
* 14:57 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:56 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
* 14:53 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:52 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
* 14:52 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2351.codfw.wmnet with OS trixie
* 14:49 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 14:49 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 14:48 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2352.codfw.wmnet with OS trixie
* 14:48 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 14:48 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 14:48 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:47 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 14:45 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2350.codfw.wmnet with OS trixie
* 14:44 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 14:43 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:43 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 14:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2348.codfw.wmnet with OS trixie
* 14:41 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2353.codfw.wmnet with reason: host reimage
* 14:37 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2349.codfw.wmnet with reason: host reimage
* 14:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2351.codfw.wmnet with reason: host reimage
* 14:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2352.codfw.wmnet with reason: host reimage
* 14:29 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:28 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:26 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2350.codfw.wmnet with reason: host reimage
* 14:23 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2348.codfw.wmnet with reason: host reimage
* 14:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2351.codfw.wmnet with reason: host reimage
* 14:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2352.codfw.wmnet with reason: host reimage
* 14:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2353.codfw.wmnet with reason: host reimage
* 14:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2350.codfw.wmnet with reason: host reimage
* 14:19 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2349.codfw.wmnet with reason: host reimage
* 14:19 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2348.codfw.wmnet with reason: host reimage
* 14:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2353.codfw.wmnet with OS trixie
* 14:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2352.codfw.wmnet with OS trixie
* 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2351.codfw.wmnet with OS trixie
* 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2350.codfw.wmnet with OS trixie
* 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2349.codfw.wmnet with OS trixie
* 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2348.codfw.wmnet with OS trixie
* 14:03 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2349.codfw.wmnet with OS trixie
* 14:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2349.codfw.wmnet with OS trixie
* 14:03 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2348.codfw.wmnet with OS trixie
* 14:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2348.codfw.wmnet with OS trixie
* 14:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2342-2347].codfw.wmnet
* 14:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2342-2347].codfw.wmnet
* 14:01 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2347.codfw.wmnet with OS trixie
* 13:57 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2346.codfw.wmnet with OS trixie
* 13:55 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2343.codfw.wmnet with OS trixie
* 13:50 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2345.codfw.wmnet with OS trixie
* 13:48 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2344.codfw.wmnet with OS trixie
* 13:45 dreamyjazz@deploy2002: mwscript-k8s job started: foreachwikiindblist checkuser-suggested-investigations CheckUser:queueAutoCloseSICases.php # [[phab:T418591|T418591]]
* 13:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2342.codfw.wmnet with OS trixie
* 13:42 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2347.codfw.wmnet with reason: host reimage
* 13:38 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2346.codfw.wmnet with reason: host reimage
* 13:35 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2343.codfw.wmnet with reason: host reimage
* 13:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2345.codfw.wmnet with reason: host reimage
* 13:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2344.codfw.wmnet with reason: host reimage
* 13:24 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2342.codfw.wmnet with reason: host reimage
* 13:21 Dreamy_Jazz: Running foreachwikiindblist checkuser-suggested-investigations.dblist ~/PopulateSiuInfo.php --batch-size=1000 for [[phab:T411118|T411118]]
* 13:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2347.codfw.wmnet with reason: host reimage
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2346.codfw.wmnet with reason: host reimage
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2345.codfw.wmnet with reason: host reimage
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2344.codfw.wmnet with reason: host reimage
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2343.codfw.wmnet with reason: host reimage
* 13:19 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2342.codfw.wmnet with reason: host reimage
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2347.codfw.wmnet with OS trixie
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2346.codfw.wmnet with OS trixie
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2345.codfw.wmnet with OS trixie
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2344.codfw.wmnet with OS trixie
* 13:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2343.codfw.wmnet with OS trixie
* 13:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2342.codfw.wmnet with OS trixie
* 13:05 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2336-2341].codfw.wmnet
* 13:05 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2336-2341].codfw.wmnet
* 13:01 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2341.codfw.wmnet with OS trixie
* 12:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2340.codfw.wmnet with OS trixie
* 12:49 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2337.codfw.wmnet with OS trixie
* 12:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2338.codfw.wmnet with OS trixie
* 12:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2336.codfw.wmnet with OS trixie
* 12:40 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2341.codfw.wmnet with reason: host reimage
* 12:35 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2339.codfw.wmnet with OS trixie
* 12:34 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2340.codfw.wmnet with reason: host reimage
* 12:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2337.codfw.wmnet with reason: host reimage
* 12:26 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2338.codfw.wmnet with reason: host reimage
* 12:22 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2336.codfw.wmnet with reason: host reimage
* 12:18 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2339.codfw.wmnet with reason: host reimage
* 12:13 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2340.codfw.wmnet with reason: host reimage
* 12:13 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2341.codfw.wmnet with reason: host reimage
* 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2337.codfw.wmnet with reason: host reimage
* 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2338.codfw.wmnet with reason: host reimage
* 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2336.codfw.wmnet with reason: host reimage
* 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2339.codfw.wmnet with reason: host reimage
* 12:00 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2341.codfw.wmnet with OS trixie
* 12:00 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2340.codfw.wmnet with OS trixie
* 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2339.codfw.wmnet with OS trixie
* 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2338.codfw.wmnet with OS trixie
* 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2337.codfw.wmnet with OS trixie
* 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2336.codfw.wmnet with OS trixie
* 11:56 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2333-2335].codfw.wmnet
* 11:55 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2333-2335].codfw.wmnet
* 11:55 elukey@cumin1003: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 11:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1207.eqiad.wmnet
* 11:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2335.codfw.wmnet with OS trixie
* 11:53 moritzm: uploaded icu 72.1-3+deb12u1~wmf11u1 to component/php83-icu72 [[phab:T419058|T419058]] (backport of ICU 72 from Bookworm to Bullseye, built to be co-installable with the native ICU from Bullseye)
* 11:50 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2334.codfw.wmnet with OS trixie
* 11:49 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1207.eqiad.wmnet
* 11:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1205.eqiad.wmnet
* 11:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2333.codfw.wmnet with OS trixie
* 11:39 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 11:39 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1205.eqiad.wmnet
* 11:34 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2335.codfw.wmnet with reason: host reimage
* 11:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2334.codfw.wmnet with reason: host reimage
* 11:27 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2333.codfw.wmnet with reason: host reimage
* 11:23 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2335.codfw.wmnet with reason: host reimage
* 11:22 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2334.codfw.wmnet with reason: host reimage
* 11:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2333.codfw.wmnet with reason: host reimage
* 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2334.codfw.wmnet with OS trixie
* 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2335.codfw.wmnet with OS trixie
* 11:08 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2333.codfw.wmnet with OS trixie
* 11:06 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2332.codfw.wmnet
* 11:05 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2332.codfw.wmnet
* 11:02 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2332.codfw.wmnet with OS trixie
* 10:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2332.codfw.wmnet with reason: host reimage
* 10:36 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2332.codfw.wmnet with reason: host reimage
* 10:23 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2332.codfw.wmnet with OS trixie
* 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1199.eqiad.wmnet
* 10:21 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1194.eqiad.wmnet
* 10:16 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2332-2356].codfw.wmnet
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1194.eqiad.wmnet
* 10:09 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1199.eqiad.wmnet
* 10:09 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2332-2356].codfw.wmnet
* 09:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:39 Emperor: repool ms-fe1013 after PXE work [[phab:T401966|T401966]]
* 09:23 derick@deploy2002: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=pmswiki --logwiki=metawiki Wikilimes Limes.pink # [[phab:T419184|T419184]]
* 09:10 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:09 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:08 elukey@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-fe1013.eqiad.wmnet
* 09:08 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-fe1013.eqiad.wmnet
* 08:57 elukey@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-fe1013.eqiad.wmnet
* 08:56 elukey@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-fe1013.eqiad.wmnet
* 08:54 elukey@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-fe1013.eqiad.wmnet
* 08:42 elukey@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-fe1013.eqiad.wmnet
* 08:25 moritzm: uploaded openjdk-8 8u482-ga-1~deb12u1 to component/jdk8 of bookworm-wikimedia
* 08:11 moritzm: imported prometheus-ganeti-exporter 0.3+deb12u2 for bookworm-wikimedia [[phab:T419166|T419166]]
* 06:23 ryankemper@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
* 06:23 ryankemper@deploy2002: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
* 06:23 ryankemper@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
* 06:23 ryankemper@deploy2002: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
* 06:22 ryankemper@deploy2002: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
* 06:22 ryankemper@deploy2002: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
* 02:59 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:59 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
* 02:59 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
* 02:56 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 02:21 zabe: zabe@deploy2002:/srv/mediawiki-staging$ foreachwiki extensions/TimedMediaHandler/maintenance/migrateTranscodeStates.php --force # [[phab:T415064|T415064]]
* 02:16 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248658{{!}}Update interwiki cache]] (duration: 06m 38s)
* 02:12 zabe@deploy2002: mwscript-k8s job started: foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https # [[phab:T415978|T415978]], [[phab:T414241|T414241]]
* 02:12 zabe@deploy2002: zabe: Continuing with sync
* 02:11 zabe@deploy2002: zabe: Backport for [[gerrit:1248658{{!}}Update interwiki cache]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 02:09 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248658{{!}}Update interwiki cache]]
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 23s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:59 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248656{{!}}Set urwikisource to rtl (T415960)]] (duration: 06m 39s)
* 01:55 zabe@deploy2002: zabe: Continuing with sync
* 01:54 zabe@deploy2002: zabe: Backport for [[gerrit:1248656{{!}}Set urwikisource to rtl (T415960)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:53 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248656{{!}}Set urwikisource to rtl (T415960)]]
* 01:45 zabe@deploy2002: Sync cancelled.
* 01:43 zabe@deploy2002: zabe: Backport for [[gerrit:1248653{{!}}Activate urwikisource (T415960)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:42 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248653{{!}}Activate urwikisource (T415960)]]
* 01:38 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248652{{!}}Prepare urwikisource (T415960)]] (duration: 06m 18s)
* 01:34 zabe@deploy2002: zabe: Continuing with sync
* 01:34 zabe@deploy2002: zabe: Backport for [[gerrit:1248652{{!}}Prepare urwikisource (T415960)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:32 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248652{{!}}Prepare urwikisource (T415960)]]
* 01:29 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248650{{!}}Activate kaiwiki (T414234)]] (duration: 06m 57s)
* 01:25 zabe@deploy2002: zabe: Continuing with sync
* 01:24 zabe@deploy2002: zabe: Backport for [[gerrit:1248650{{!}}Activate kaiwiki (T414234)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:22 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248650{{!}}Activate kaiwiki (T414234)]]
* 01:17 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248647{{!}}Prepare kaiwiki (T414234)]] (duration: 07m 25s)
* 01:13 zabe@deploy2002: zabe: Continuing with sync
* 01:11 zabe@deploy2002: zabe: Backport for [[gerrit:1248647{{!}}Prepare kaiwiki (T414234)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:09 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248647{{!}}Prepare kaiwiki (T414234)]]
* 00:33 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248493{{!}}Stop writing to il_to on all wikis except commons (T415787)]] (duration: 06m 22s)
* 00:29 zabe@deploy2002: zabe: Continuing with sync
* 00:28 zabe@deploy2002: zabe: Backport for [[gerrit:1248493{{!}}Stop writing to il_to on all wikis except commons (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:27 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248493{{!}}Stop writing to il_to on all wikis except commons (T415787)]]
* 00:05 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248628{{!}}Re-enable AllowUserJs (T419137)]] (duration: 08m 08s)
* 00:01 catrope@deploy2002: catrope, kharlan: Continuing with sync
== 2026-03-05 ==
* 23:58 catrope@deploy2002: catrope, kharlan: Backport for [[gerrit:1248628{{!}}Re-enable AllowUserJs (T419137)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:56 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1248628{{!}}Re-enable AllowUserJs (T419137)]]
* 23:52 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248636{{!}}CSP: Update false positives list]] (duration: 06m 34s)
* 23:52 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint2003.wikimedia.org with OS trixie
* 23:47 catrope@deploy2002: catrope: Continuing with sync
* 23:47 catrope@deploy2002: catrope: Backport for [[gerrit:1248636{{!}}CSP: Update false positives list]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:45 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1248636{{!}}CSP: Update false positives list]]
* 23:33 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint2003.wikimedia.org with reason: host reimage
* 23:29 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on contint2003.wikimedia.org with reason: host reimage
* 23:15 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1238028{{!}}Using Hadoop for MostTranscludedPages on commonswiki (T416927)]] (duration: 06m 27s)
* 23:11 zabe@deploy2002: zabe: Continuing with sync
* 23:10 zabe@deploy2002: zabe: Backport for [[gerrit:1238028{{!}}Using Hadoop for MostTranscludedPages on commonswiki (T416927)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:09 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2003.wikimedia.org with OS trixie
* 23:08 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1238028{{!}}Using Hadoop for MostTranscludedPages on commonswiki (T416927)]]
* 22:45 maryum: Deployed security fix for [[phab:T418254|T418254]]
* 22:35 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248483{{!}}SpecialWantedFiles: Use lt_title instead of lt_to (T299953)]] (duration: 06m 12s)
* 22:31 zabe@deploy2002: zabe: Continuing with sync
* 22:30 zabe@deploy2002: zabe: Backport for [[gerrit:1248483{{!}}SpecialWantedFiles: Use lt_title instead of lt_to (T299953)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:28 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248483{{!}}SpecialWantedFiles: Use lt_title instead of lt_to (T299953)]]
* 21:43 ebernhardson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248508{{!}}cirrus: Align semanticsearch cluster group name with routing (T413969)]] (duration: 07m 20s)
* 21:39 ebernhardson@deploy2002: ebernhardson: Continuing with sync
* 21:38 ebernhardson@deploy2002: ebernhardson: Backport for [[gerrit:1248508{{!}}cirrus: Align semanticsearch cluster group name with routing (T413969)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:36 ebernhardson@deploy2002: Started scap sync-world: Backport for [[gerrit:1248508{{!}}cirrus: Align semanticsearch cluster group name with routing (T413969)]]
* 21:04 jhathaway@dns1004: END - running authdns-update
* 21:02 jhathaway@dns1004: START - running authdns-update
* 20:53 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:52 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new service IPs for sophroid - jasmine@cumin2002"
* 20:52 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new service IPs for sophroid - jasmine@cumin2002"
* 20:47 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 20:28 cdanis: apt built and imported jwt-authorizer 1.3.0-1
* 20:16 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.18 refs [[phab:T413809|T413809]]
* 20:04 krinkle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248574{{!}}Allow toolforge APIs in enforced CSP mode (T135963 T419137 T220475)]] (duration: 07m 37s)
* 20:00 krinkle@deploy2002: krinkle: Continuing with sync
* 19:58 krinkle@deploy2002: krinkle: Backport for [[gerrit:1248574{{!}}Allow toolforge APIs in enforced CSP mode (T135963 T419137 T220475)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:56 krinkle@deploy2002: Started scap sync-world: Backport for [[gerrit:1248574{{!}}Allow toolforge APIs in enforced CSP mode (T135963 T419137 T220475)]]
* 19:21 sbassett@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248571{{!}}Re-enable Site JS (T419137 T419138)]] (duration: 06m 57s)
* 19:17 sbassett@deploy2002: sbassett: Continuing with sync
* 19:16 sbassett@deploy2002: sbassett: Backport for [[gerrit:1248571{{!}}Re-enable Site JS (T419137 T419138)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:15 sbassett@deploy2002: Started scap sync-world: Backport for [[gerrit:1248571{{!}}Re-enable Site JS (T419137 T419138)]]
* 19:04 dr0ptp4kt: Deploying change {{Gerrit|1239200}} for refinery ([[phab:T416481|T416481]]) using scap, then deployed onto hdfs
* 19:03 dr0ptp4kt: Deployed refinery change {{Gerrit|1240253}} ([[phab:T414478|T414478]]), {{Gerrit|1240253}} (no-op) for refinery ([[phab:T414478|T414478]]) using scap, then deployed onto hdfs
* 18:58 dr0ptp4kt@deploy2002: Finished deploy [analytics/refinery@dd641b1] (thin): Regular analytics weekly train THIN [analytics/refinery@dd641b15] (duration: 02m 02s)
* 18:56 dr0ptp4kt@deploy2002: Started deploy [analytics/refinery@dd641b1] (thin): Regular analytics weekly train THIN [analytics/refinery@dd641b15]
* 18:55 dr0ptp4kt@deploy2002: Finished deploy [analytics/refinery@dd641b1]: Regular analytics weekly train [analytics/refinery@dd641b15] (duration: 04m 18s)
* 18:50 dr0ptp4kt@deploy2002: Started deploy [analytics/refinery@dd641b1]: Regular analytics weekly train [analytics/refinery@dd641b15]
* 18:49 dr0ptp4kt@deploy2002: Finished deploy [analytics/refinery@dd641b1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@dd641b15] (duration: 01m 57s)
* 18:47 dr0ptp4kt: Deploying change {{Gerrit|1239200}} for refinery ([[phab:T416481|T416481]])
* 18:47 dr0ptp4kt@deploy2002: Started deploy [analytics/refinery@dd641b1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@dd641b15]
* 18:31 eevans@dns1004: END - running authdns-update
* 18:30 eevans@dns1004: START - running authdns-update
* 18:30 sukhe: sudo cumin -b51 "A:cp" "run-puppet-agent --enable 'rolling out 1248544'"
* 18:16 sukhe: sudo cumin "A:cp" "disable-puppet 'rolling out 1248544'"
* 18:06 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:06 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
* 18:06 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
* 18:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 17:31 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248536{{!}}Enable wgUseSiteJs on donatewiki (T419138)]] (duration: 09m 57s)
* 17:27 mszwarc@deploy2002: mszwarc, krinkle: Continuing with sync
* 17:26 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint2003.wikimedia.org with OS bookworm
* 17:26 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:23 mszwarc@deploy2002: mszwarc, krinkle: Backport for [[gerrit:1248536{{!}}Enable wgUseSiteJs on donatewiki (T419138)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:21 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1248536{{!}}Enable wgUseSiteJs on donatewiki (T419138)]]
* 17:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1178.eqiad.wmnet
* 17:16 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:12 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1162.eqiad.wmnet
* 17:12 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1162.eqiad.wmnet
* 17:10 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wikikube-worker1162.eqiad.wmnet
* 17:10 cgoubert@cumin1003: START - Cookbook sre.hosts.remove-downtime for wikikube-worker1162.eqiad.wmnet
* 17:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 17:05 taavi@cumin1003: dbctl commit (dc=all): 'enable writes', diff saved to https://phabricator.wikimedia.org/P89812 and previous config saved to /var/cache/conftool/dbconfig/20260305-170556-taavi.json
* 16:03 oblivian@cumin1003: dbctl commit (dc=all): 'read only s6', diff saved to https://phabricator.wikimedia.org/P89810 and previous config saved to /var/cache/conftool/dbconfig/20260305-160348-oblivian.json
* 15:32 taavi@cumin1003: dbctl commit (dc=all): 'set global ro', diff saved to https://phabricator.wikimedia.org/P89808 and previous config saved to /var/cache/conftool/dbconfig/20260305-153203-taavi.json
* 15:31 mszwarc@deploy2002: mszwarc: Continuing with sync
* 15:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1178.eqiad.wmnet
* 15:31 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1248509{{!}}Disable custom JS for a moment]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:29 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1248509{{!}}Disable custom JS for a moment]]
* 15:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['contint2003']
* 15:25 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['contint2003']
* 15:23 ebernhardson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248506{{!}}cirrus: Correct semantic builder config (T413969)]] (duration: 07m 39s)
* 15:22 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:19 ebernhardson@deploy2002: ebernhardson: Continuing with sync
* 15:18 ebernhardson@deploy2002: ebernhardson: Backport for [[gerrit:1248506{{!}}cirrus: Correct semantic builder config (T413969)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:16 ebernhardson@deploy2002: Started scap sync-world: Backport for [[gerrit:1248506{{!}}cirrus: Correct semantic builder config (T413969)]]
* 15:11 elukey@cumin1003: END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 15:10 ebernhardson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244713{{!}}cirrus: Add semantic search test cluster (T413969)]] (duration: 09m 18s)
* 15:06 ebernhardson@deploy2002: ebernhardson: Continuing with sync
* 15:04 sukhe@dns1004: END - running authdns-update
* 15:03 sukhe@dns1004: START - running authdns-update
* 15:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:02 ebernhardson@deploy2002: ebernhardson: Backport for [[gerrit:1244713{{!}}cirrus: Add semantic search test cluster (T413969)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:02 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host contint2003
* 15:02 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host contint2003
* 15:00 ebernhardson@deploy2002: Started scap sync-world: Backport for [[gerrit:1244713{{!}}cirrus: Add semantic search test cluster (T413969)]]
* 14:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 14:53 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 14:50 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 14:38 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1010
* 14:38 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1010
* 14:32 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1010
* 14:32 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1010
* 14:32 sukhe@dns1004: END - running authdns-update
* 14:30 sukhe@dns1004: START - running authdns-update
* 14:28 elukey@cumin1003: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 14:28 sukhe@dns1004: START - running authdns-update
* 14:27 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1231.eqiad.wmnet
* 14:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1230.eqiad.wmnet
* 14:24 bking@dns1004: START - running authdns-update
* 14:15 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1230.eqiad.wmnet
* 14:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1229.eqiad.wmnet
* 14:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'ores-legacy' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 14:05 moritzm: imported nodejs 24.14.0-1nodesource1 to thirdparty/node24 [[phab:T418440|T418440]]
* 14:03 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1229.eqiad.wmnet
* 14:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1228.eqiad.wmnet
* 14:01 moritzm: initialised ganeti02/ulsfo cluster [[phab:T418993|T418993]]
* 13:57 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:56 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:56 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:53 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:52 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1228.eqiad.wmnet
* 13:52 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1227.eqiad.wmnet
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:46 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:43 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:42 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:42 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1199.eqiad.wmnet
* 13:40 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1227.eqiad.wmnet
* 13:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1226.eqiad.wmnet
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'ores-legacy' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:35 moritzm: installing glib2.0 security updates
* 13:35 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1199.eqiad.wmnet
* 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:26 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1226.eqiad.wmnet
* 13:26 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1225.eqiad.wmnet
* 13:26 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:15 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1225.eqiad.wmnet
* 13:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1224.eqiad.wmnet
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4006.ulsfo.wmnet with OS bookworm
* 13:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:07 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new VIP for routed ganeti in ulsfo - jmm@cumin2002"
* 13:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new VIP for routed ganeti in ulsfo - jmm@cumin2002"
* 13:06 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:05 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:02 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:01 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1224.eqiad.wmnet
* 13:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1223.eqiad.wmnet
* 13:00 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:58 cgoubert@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on wikikube-worker1162.eqiad.wmnet with reason: dcops intervention
* 12:57 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1162.eqiad.wmnet
* 12:56 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1162.eqiad.wmnet
* 12:55 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 12:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
* 12:49 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1223.eqiad.wmnet
* 12:49 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1222.eqiad.wmnet
* 12:46 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 12:43 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
* 12:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1222.eqiad.wmnet
* 12:37 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1221.eqiad.wmnet
* 12:23 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1221.eqiad.wmnet
* 12:23 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1220.eqiad.wmnet
* 12:23 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4006.ulsfo.wmnet with OS bookworm
* 12:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1220.eqiad.wmnet
* 11:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4005.ulsfo.wmnet
* 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4005.ulsfo.wmnet
* 11:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1236.eqiad.wmnet
* 11:29 moritzm: remove ganeti4006 from ganeti/ulsfo cluster [[phab:T418993|T418993]]
* 11:25 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1236.eqiad.wmnet
* 11:25 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1235.eqiad.wmnet
* 11:16 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1235.eqiad.wmnet
* 11:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1234.eqiad.wmnet
* 11:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1234.eqiad.wmnet
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1233.eqiad.wmnet
* 11:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1233.eqiad.wmnet
* 11:02 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1232.eqiad.wmnet
* 11:02 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 11:00 elukey@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4005.ulsfo.wmnet with OS bookworm
* 10:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 10:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1232.eqiad.wmnet
* 10:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1231.eqiad.wmnet
* 10:54 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 10:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1231.eqiad.wmnet
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1230.eqiad.wmnet
* 10:41 elukey@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4005.ulsfo.wmnet with reason: host reimage
* 10:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1230.eqiad.wmnet
* 10:37 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1229.eqiad.wmnet
* 10:32 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4005.ulsfo.wmnet with reason: host reimage
* 10:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1229.eqiad.wmnet
* 10:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1228.eqiad.wmnet
* 10:24 moritzm: installing Java 8 security updates
* 10:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1228.eqiad.wmnet
* 10:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1227.eqiad.wmnet
* 10:15 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1227.eqiad.wmnet
* 10:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1226.eqiad.wmnet
* 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
* 10:11 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4005.ulsfo.wmnet with OS bookworm
* 10:10 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ganeti4005.ulsfo.wmnet
* 10:08 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ganeti4005.ulsfo.wmnet
* 10:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add gw-virtual.ulsfo.wmnet - ayounsi@cumin1003"
* 10:07 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1226.eqiad.wmnet
* 10:07 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1225.eqiad.wmnet
* 09:59 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1225.eqiad.wmnet
* 09:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1224.eqiad.wmnet
* 09:52 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1224.eqiad.wmnet
* 09:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1223.eqiad.wmnet
* 09:44 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1223.eqiad.wmnet
* 09:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1222.eqiad.wmnet
* 09:43 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add gw-virtual.ulsfo.wmnet - ayounsi@cumin1003"
* 09:36 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1222.eqiad.wmnet
* 09:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1221.eqiad.wmnet
* 09:32 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:32 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:28 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1221.eqiad.wmnet
* 09:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1220.eqiad.wmnet
* 09:20 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1220.eqiad.wmnet
* 09:02 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
* 08:38 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247990{{!}}Drop 'centralnoticeadmin' from $wgOATHRequiredForGroups (T418580)]] (duration: 07m 07s)
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/airflow-main: apply
* 08:34 mszwarc@deploy2002: mszwarc: Continuing with sync
* 08:33 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1247990{{!}}Drop 'centralnoticeadmin' from $wgOATHRequiredForGroups (T418580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:30 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1247990{{!}}Drop 'centralnoticeadmin' from $wgOATHRequiredForGroups (T418580)]]
* 08:29 gehel@dns1004: END - running authdns-update
* 08:28 gehel@dns1004: START - running authdns-update
* 08:27 moritzm: installing mbedtls security updates
* 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 08:15 hashar@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248314{{!}}Revert "zhwiki: Add 2026 CNY celebration logos"]] (duration: 09m 19s)
* 08:11 hashar@deploy2002: hashar, stang: Continuing with sync
* 08:08 hashar@deploy2002: hashar, stang: Backport for [[gerrit:1248314{{!}}Revert "zhwiki: Add 2026 CNY celebration logos"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:06 hashar@deploy2002: Started scap sync-world: Backport for [[gerrit:1248314{{!}}Revert "zhwiki: Add 2026 CNY celebration logos"]]
* 08:02 moritzm: uploaded openjdk-8 8u482-ga-1~deb11u1 to component/jdk8 of bullseye-wikimedia
* 07:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts bast4005.wikimedia.org
* 07:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:56 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast4005.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast4005.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:48 moritzm: uploaded bird2 2.18-1~wmf13u2 to the main component of trixie-wikimedia [[phab:T413740|T413740]]
* 07:47 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:47 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 07:42 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts bast4005.wikimedia.org
* 06:35 marostegui@cumin1003: dbctl commit (dc=all): 'Remove es1033 [[phab:T408772|T408772]]', diff saved to https://phabricator.wikimedia.org/P89804 and previous config saved to /var/cache/conftool/dbconfig/20260305-063548-marostegui.json
* 02:10 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 55s)
* 02:02 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 02:01 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248163{{!}}Stop writing to il_to on medium size wikis (T415787)]] (duration: 06m 14s)
* 01:58 zabe@deploy2002: zabe: Continuing with sync
* 01:57 zabe@deploy2002: zabe: Backport for [[gerrit:1248163{{!}}Stop writing to il_to on medium size wikis (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:55 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248163{{!}}Stop writing to il_to on medium size wikis (T415787)]]
* 01:40 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1246099{{!}}Start reading from new file tables on medium wikis (T416548)]] (duration: 06m 15s)
* 01:36 zabe@deploy2002: zabe: Continuing with sync
* 01:36 zabe@deploy2002: zabe: Backport for [[gerrit:1246099{{!}}Start reading from new file tables on medium wikis (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:34 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1246099{{!}}Start reading from new file tables on medium wikis (T416548)]]
* 01:29 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248154{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]], [[gerrit:1248153{{!}}Revert^2 "ImageListPager: Properly support file schema migration read new"]] (duration: 07m 21s)
* 01:25 zabe@deploy2002: zabe: Continuing with sync
* 01:23 zabe@deploy2002: zabe: Backport for [[gerrit:1248154{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]], [[gerrit:1248153{{!}}Revert^2 "ImageListPager: Properly support file schema migration read new"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:21 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248154{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]], [[gerrit:1248153{{!}}Revert^2 "ImageListPager: Properly support file schema migration read new"]]
* 00:55 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248021{{!}}Stop writing to il_to on small wikis (T415787)]] (duration: 06m 49s)
* 00:51 zabe@deploy2002: zabe: Continuing with sync
* 00:50 zabe@deploy2002: zabe: Backport for [[gerrit:1248021{{!}}Stop writing to il_to on small wikis (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:48 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248021{{!}}Stop writing to il_to on small wikis (T415787)]]
* 00:19 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248125{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]], [[gerrit:1248123{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]] (duration: 08m 52s)
* 00:13 zabe@deploy2002: zabe: Continuing with sync
* 00:12 zabe@deploy2002: zabe: Backport for [[gerrit:1248125{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]], [[gerrit:1248123{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:10 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248125{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]], [[gerrit:1248123{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]]
== 2026-03-04 ==
* 22:57 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 22:56 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 22:55 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 22:55 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 22:55 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 22:54 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 22:35 tgr_: UTC late deploys done
* 22:33 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248084{{!}}Introduce a Semantic Search query route and builder (T413969)]], [[gerrit:1248085{{!}}Wire up semantic query building (T413969)]] (duration: 38m 28s)
* 22:16 tgr@deploy2002: tgr, ebernhardson: Continuing with sync
* 22:14 tgr@deploy2002: tgr, ebernhardson: Backport for [[gerrit:1248084{{!}}Introduce a Semantic Search query route and builder (T413969)]], [[gerrit:1248085{{!}}Wire up semantic query building (T413969)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248084{{!}}Introduce a Semantic Search query route and builder (T413969)]], [[gerrit:1248085{{!}}Wire up semantic query building (T413969)]]
* 21:48 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248012{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #3) (T415007 T418999)]] (duration: 07m 05s)
* 21:47 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on dse-k8s-worker1028.eqiad.wmnet with reason: broken networking
* 21:44 tgr@deploy2002: tgr: Continuing with sync
* 21:43 tgr@deploy2002: tgr: Backport for [[gerrit:1248012{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #3) (T415007 T418999)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:40 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248012{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #3) (T415007 T418999)]]
* 21:36 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248081{{!}}Add synthetic AAA experiment (T418614)]], [[gerrit:1248080{{!}}Add synthetic AAA experiment (T418614)]] (duration: 09m 11s)
* 21:35 bking@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1028.eqiad.wmnet
* 21:32 tgr@deploy2002: cjming, tgr: Continuing with sync
* 21:30 bking@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1028.eqiad.wmnet
* 21:29 tgr@deploy2002: cjming, tgr: Backport for [[gerrit:1248081{{!}}Add synthetic AAA experiment (T418614)]], [[gerrit:1248080{{!}}Add synthetic AAA experiment (T418614)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:27 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248081{{!}}Add synthetic AAA experiment (T418614)]], [[gerrit:1248080{{!}}Add synthetic AAA experiment (T418614)]]
* 21:21 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1245473{{!}}logging: set poolcounter channel log level to info (T418612)]] (duration: 09m 04s)
* 21:17 tgr@deploy2002: tgr, cwhite: Continuing with sync
* 21:14 tgr@deploy2002: tgr, cwhite: Backport for [[gerrit:1245473{{!}}logging: set poolcounter channel log level to info (T418612)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:12 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1245473{{!}}logging: set poolcounter channel log level to info (T418612)]]
* 21:07 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248007{{!}}Fix $wgJwtSessionCookieIssuer (T415007 T418999)]] (duration: 09m 55s)
* 21:03 tgr@deploy2002: tgr: Continuing with sync
* 20:59 tgr@deploy2002: tgr: Backport for [[gerrit:1248007{{!}}Fix $wgJwtSessionCookieIssuer (T415007 T418999)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:57 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248007{{!}}Fix $wgJwtSessionCookieIssuer (T415007 T418999)]]
* 19:56 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.18 refs [[phab:T413809|T413809]]
* 19:44 jhuneidi@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248011{{!}}CategoryViewer: Fall back to empty string in case of missing nextpage (T418934)]] (duration: 10m 47s)
* 19:44 cdobbins@cumin2002: conftool action : set/pooled=yes:weight=1; selector: name=cp205[0-8].codfw.wmnet
* 19:43 cdobbins@cumin2002: conftool action : set/pooled=yes:weight=1; selector: name=cp2049.codfw.wmnet
* 19:40 jhuneidi@deploy2002: zabe, jhuneidi: Continuing with sync
* 19:35 jhuneidi@deploy2002: zabe, jhuneidi: Backport for [[gerrit:1248011{{!}}CategoryViewer: Fall back to empty string in case of missing nextpage (T418934)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:34 brett@puppetserver1001: conftool action : set/weight=1; selector: name=cp2043.*
* 19:34 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2043.*
* 19:33 jhuneidi@deploy2002: Started scap sync-world: Backport for [[gerrit:1248011{{!}}CategoryViewer: Fall back to empty string in case of missing nextpage (T418934)]]
* 19:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2043.codfw.wmnet with OS trixie
* 19:23 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 19:22 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 19:22 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 19:22 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 19:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2043.codfw.wmnet with reason: host reimage
* 19:06 brett@puppetserver1001: conftool action : set/weight=1; selector: name=cp204[45678].*
* 19:04 brett@puppetserver1001: conftool action : set/weight=100; selector: name=cp204[45678].*
* 19:02 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2043.codfw.wmnet with reason: host reimage
* 18:58 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp204[45678].*
* 18:52 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 18:51 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 18:50 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 18:50 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 18:49 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:49 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:49 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 18:48 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 18:48 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 18:47 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 18:47 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 18:47 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp2043.codfw.wmnet with OS trixie
* 18:46 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 18:43 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 18:42 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 18:41 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 18:41 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 18:40 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:40 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:40 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 18:40 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 18:39 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 18:39 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 18:38 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 18:37 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 18:37 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:37 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:32 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:32 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:16 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 18:16 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 18:16 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 18:15 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 18:15 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:14 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:14 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 18:13 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 18:13 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 18:13 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 18:12 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 18:12 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2047.codfw.wmnet with OS trixie
* 17:33 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
* 17:27 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
* 17:23 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:23 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:18 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:18 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:15 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:13 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:09 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp2047.codfw.wmnet with OS trixie
* 16:55 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:55 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:54 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:54 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1007.eqiad.wmnet with OS bookworm
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-unlock-scap (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:39 root@deploy2002: Unlocked for deployment [ALL REPOSITORIES]: Datacenter switchover from eqiad to codfw - [[phab:T418133|T418133]] (duration: 25m 37s)
* 16:39 root@deploy2002: Forcefully removing global lock: Datacenter switchover from eqiad to codfw - [[phab:T418133|T418133]]
* 16:39 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-unlock-scap for datacenter switchover from eqiad to codfw
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
* 16:27 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
* 16:27 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters for datacenter switchover from eqiad to codfw
* 16:27 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-restore-ttl (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:26 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-restore-ttl for datacenter switchover from eqiad to codfw
* 16:26 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:26 root@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 16:26 root@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 16:26 root@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 16:26 root@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 16:26 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.08-start-maintenance for datacenter switchover from eqiad to codfw
* 16:25 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:25 root@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: sync
* 16:25 root@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: sync
* 16:25 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: [DRY-RUN] MediaWiki read-only period ends at: 2026-03-04 16:24:40.502004
* 16:24 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.07-set-readwrite for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1007.eqiad.wmnet with reason: host reimage
* 16:24 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki for datacenter switchover from eqiad to codfw
* 16:23 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:23 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.03-set-db-readonly for datacenter switchover from eqiad to codfw
* 16:23 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.02-set-readonly (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:22 blake@cumin1003: [DRY-RUN] MediaWiki read-only period starts at: 2026-03-04 16:22:41.755892
* 16:22 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.02-set-readonly for datacenter switchover from eqiad to codfw
* 16:20 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1007.eqiad.wmnet with reason: host reimage
* 16:20 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:20 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance for datacenter switchover from eqiad to codfw
* 16:19 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:14 moritzm: upgrading cloudservices* to Bird 2.18 [[phab:T413740|T413740]]
* 16:14 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl for datacenter switchover from eqiad to codfw
* 16:13 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-lock-scap (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:13 root@deploy2002: Locking from deployment [ALL REPOSITORIES]: Datacenter switchover from eqiad to codfw - [[phab:T418133|T418133]]
* 16:13 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-lock-scap for datacenter switchover from eqiad to codfw
* 16:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
* 16:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
* 16:10 moritzm: remove ganeti4005 from ganeti/ulsfo cluster [[phab:T418993|T418993]]
* 16:10 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1007.eqiad.wmnet with OS bookworm
* 16:06 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:06 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks for datacenter switchover from eqiad to codfw
* 15:59 XioNoX: push pfw policies - [[phab:T418402|T418402]]
* 15:37 sukhe@dns1004: END - running authdns-update
* 15:36 sukhe@dns1004: START - running authdns-update
* 15:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1219.eqiad.wmnet
* 15:32 aqu@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 15:31 aqu@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
* 15:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4005.ulsfo.wmnet
* 15:29 cgoubert@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on P<nowiki>{</nowiki>ms-fe10[14-24].*<nowiki>}</nowiki> and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 15:24 cgoubert@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on P<nowiki>{</nowiki>ms-fe10[14-24].*<nowiki>}</nowiki> and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 15:22 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 15:22 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 15:22 cgoubert@cumin1003: END (ERROR) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=97) rolling restart_daemons on A:swift-fe-eqiad
* 15:21 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1219.eqiad.wmnet
* 15:21 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1218.eqiad.wmnet
* 15:19 cgoubert@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-eqiad
* 15:16 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: name=cirrussearch1120.eqiad.wmnet
* 15:16 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: name=cirrussearch1121.eqiad.wmnet
* 15:16 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1115.eqiad.wmnet [reason: [[phab:T418772|T418772]] - BGP maintenance]
* 15:16 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:16 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: name=cirrussearch1122.eqiad.wmnet
* 15:15 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:15 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:14 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:13 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:13 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:10 XioNoX: lsw1-d7-eqiad# tools network-instance default protocols bgp neighbor 10.64.128.17 reset-peer - [[phab:T418772|T418772]]
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
* 15:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1218.eqiad.wmnet
* 15:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1217.eqiad.wmnet
* 15:09 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:05 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:05 moritzm: upgrading cloudlb* to Bird 2.18 [[phab:T413740|T413740]]
* 15:05 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:04 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:58 Dreamy_Jazz: Afternoon UTC backport window done
* 14:58 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244373{{!}}zhwiki: Remove all rights from accountcreator (T418089)]] (duration: 08m 12s)
* 14:57 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:57 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1217.eqiad.wmnet
* 14:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1216.eqiad.wmnet
* 14:57 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:56 btullis@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6 days, 0:00:00 on dse-k8s-worker[1010-1011,1013,1018-1019].eqiad.wmnet with reason: Adding 10 Gbps NIC
* 14:54 dreamyjazz@deploy2002: dreamyjazz, 1f616emo: Continuing with sync
* 14:52 jmm@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
* 14:52 dreamyjazz@deploy2002: dreamyjazz, 1f616emo: Backport for [[gerrit:1244373{{!}}zhwiki: Remove all rights from accountcreator (T418089)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:50 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1244373{{!}}zhwiki: Remove all rights from accountcreator (T418089)]]
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1216.eqiad.wmnet
* 14:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1215.eqiad.wmnet
* 14:44 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248009{{!}}Hooks: Fix liquidthreads log type definition bugs (T417425 T419006)]], [[gerrit:1248008{{!}}Define $wgWikimediaMessagesHasLiquidThreadsLogs (T417425)]] (duration: 07m 11s)
* 14:44 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1115.eqiad.wmnet [reason: [[phab:T418772|T418772]] - BGP maintenance]
* 14:44 taavi: updating CR firewall policy with https://gerrit.wikimedia.org/r/c/operations/homer/public/+/970275
* 14:42 ayounsi@cumin1003: conftool action : set/pooled=no; selector: name=cirrussearch1122.eqiad.wmnet
* 14:42 ayounsi@cumin1003: conftool action : set/pooled=no; selector: name=cirrussearch1121.eqiad.wmnet
* 14:42 ayounsi@cumin1003: conftool action : set/pooled=no; selector: name=cirrussearch1120.eqiad.wmnet
* 14:40 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 14:39 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1248009{{!}}Hooks: Fix liquidthreads log type definition bugs (T417425 T419006)]], [[gerrit:1248008{{!}}Define $wgWikimediaMessagesHasLiquidThreadsLogs (T417425)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:37 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1248009{{!}}Hooks: Fix liquidthreads log type definition bugs (T417425 T419006)]], [[gerrit:1248008{{!}}Define $wgWikimediaMessagesHasLiquidThreadsLogs (T417425)]]
* 14:33 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1215.eqiad.wmnet
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1214.eqiad.wmnet
* 14:32 btullis@puppetserver1001: conftool action : get/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1028.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1028.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1025.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1024.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/weight=1; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1028.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/weight=1; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1025.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/weight=1; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1024.eqiad.wmnet
* 14:30 btullis@puppetserver1001: conftool action : get/pooled; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1024.eqiad.wmnet
* 14:29 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_esams and A:cp - 3.0 upgrade ()
* 14:27 arnaudb@dns1004: END - running authdns-update
* 14:26 arnaudb@dns1004: START - running authdns-update
* 14:26 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248000{{!}}Revert "Enable JWT session cookie for bot passwords (all wikis) (attempt #2)" (T415007 T418999)]] (duration: 07m 19s)
* 14:22 tgr@deploy2002: tgr: Continuing with sync
* 14:21 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1214.eqiad.wmnet
* 14:21 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1213.eqiad.wmnet
* 14:21 tgr@deploy2002: tgr: Backport for [[gerrit:1248000{{!}}Revert "Enable JWT session cookie for bot passwords (all wikis) (attempt #2)" (T415007 T418999)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:19 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248000{{!}}Revert "Enable JWT session cookie for bot passwords (all wikis) (attempt #2)" (T415007 T418999)]]
* 14:14 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247566{{!}}Enable new HTML confirmation emails for all (T416748)]] (duration: 07m 46s)
* 14:13 elukey@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 14:13 elukey@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 14:10 sgimeno@deploy2002: migr, sgimeno: Continuing with sync
* 14:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1213.eqiad.wmnet
* 14:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1212.eqiad.wmnet
* 14:09 sgimeno@deploy2002: migr, sgimeno: Backport for [[gerrit:1247566{{!}}Enable new HTML confirmation emails for all (T416748)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 elukey@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 14:08 elukey@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 14:08 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:07 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:07 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1247566{{!}}Enable new HTML confirmation emails for all (T416748)]]
* 13:59 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4005.ulsfo.wmnet
* 13:58 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4005.ulsfo.wmnet
* 13:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4005.ulsfo.wmnet
* 13:57 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1212.eqiad.wmnet
* 13:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1211.eqiad.wmnet
* 13:49 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_esams and A:cp - 3.0 upgrade ()
* 13:45 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1211.eqiad.wmnet
* 13:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1210.eqiad.wmnet
* 13:43 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_esams and A:cp - 3.0 upgrade ()
* 13:40 arnaudb@dns1004: END - running authdns-update
* 13:39 arnaudb@dns1004: START - running authdns-update
* 13:37 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
* 13:33 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1210.eqiad.wmnet
* 13:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1209.eqiad.wmnet
* 13:20 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1209.eqiad.wmnet
* 13:20 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1208.eqiad.wmnet
* 13:17 aokoth@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:17 aokoth@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:16 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:15 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1208.eqiad.wmnet
* 13:06 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1207.eqiad.wmnet
* 13:03 arnaudb@dns1005: END - running authdns-update
* 13:02 arnaudb@dns1005: START - running authdns-update
* 13:00 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_esams and A:cp - 3.0 upgrade ()
* 13:00 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
* 12:46 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 12:45 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 12:44 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 12:44 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
* 12:43 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 12:43 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
* 12:33 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 12:29 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
* 12:10 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
* 12:08 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp - 3.0 upgrade ()
* 12:03 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1207.eqiad.wmnet
* 12:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1206.eqiad.wmnet
* 11:53 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1206.eqiad.wmnet
* 11:53 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1205.eqiad.wmnet
* 11:36 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f8-eqiad
* 11:36 jmm@cumin2002: START - Cookbook sre.network.tls for network device lsw1-f8-eqiad
* 11:34 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp - 3.0 upgrade ()
* 11:34 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
* 11:28 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247968{{!}}SI: Update instrumentation schema (T418293)]] (duration: 16m 22s)
* 11:22 fabfur: start upgrading haproxy to 3.0 on A:cp-eqiad ([[phab:T417253|T417253]])
* 11:22 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 11:17 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1247968{{!}}SI: Update instrumentation schema (T418293)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:13 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp - 3.0 upgrade ()
* 11:12 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1247968{{!}}SI: Update instrumentation schema (T418293)]]
* 11:08 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp - 3.0 upgrade ()
* 11:07 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=1) rolling reimage on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 11:07 blake@cumin1003: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 11:06 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2356].codfw.wmnet
* 11:06 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2356].codfw.wmnet
* 11:03 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=1) rolling reimage on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 11:03 blake@cumin1003: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 10:55 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2356].codfw.wmnet
* 10:55 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2356].codfw.wmnet
* 10:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1205.eqiad.wmnet
* 10:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1204.eqiad.wmnet
* 10:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 10:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 10:42 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1204.eqiad.wmnet
* 10:42 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1203.eqiad.wmnet
* 10:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 10:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 10:28 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1203.eqiad.wmnet
* 10:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1202.eqiad.wmnet
* 10:25 fabfur: start upgrading haproxy to 3.0 on A:cp-drmrs ([[phab:T417253|T417253]])
* 10:25 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp - 3.0 upgrade ()
* 10:25 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp - 3.0 upgrade ()
* 10:24 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247941{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]], [[gerrit:1247944{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]] (duration: 06m 42s)
* 10:22 arnaudb@dns1004: END - running authdns-update
* 10:20 arnaudb@dns1004: START - running authdns-update
* 10:20 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 10:20 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1247941{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]], [[gerrit:1247944{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:18 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1247941{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]], [[gerrit:1247944{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]]
* 10:16 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1202.eqiad.wmnet
* 10:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1201.eqiad.wmnet
* 10:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 10:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 10:06 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 10:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 10:04 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1201.eqiad.wmnet
* 10:04 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1200.eqiad.wmnet
* 09:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 09:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1200.eqiad.wmnet
* 09:39 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247925{{!}}Require 2FA from CentralNotice admins and WMF Trust & Safety (T418580 T417880)]] (duration: 08m 23s)
* 09:36 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_codfw and A:cp - 3.0 upgrade ()
* 09:35 mszwarc@deploy2002: mszwarc: Continuing with sync
* 09:33 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1247925{{!}}Require 2FA from CentralNotice admins and WMF Trust & Safety (T418580 T417880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pki-root1002.eqiad.wmnet
* 09:31 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp - 3.0 upgrade ()
* 09:31 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1247925{{!}}Require 2FA from CentralNotice admins and WMF Trust & Safety (T418580 T417880)]]
* 09:30 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host pki-root1002.eqiad.wmnet
* 09:20 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:20 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:20 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:20 jmm@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:03 gehel: switching off Blazegraph on wdqs2009 (legacy full graph endpoint is end of life) - [[phab:T411410|T411410]] / [[phab:T415073|T415073]]
* 09:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 09:02 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1096.eqiad.wmnet
* 09:00 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet
* 08:59 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 08:58 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 08:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 08:56 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1096.eqiad.wmnet
* 08:54 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet
* 08:52 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 08:49 topranks: disabling IBGP session between ssw1-d1-eqiad and ssw1-d8-eqiad to remove backup paths (try #2) [[phab:T411054|T411054]]
* 08:36 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on backup1007.eqiad.wmnet,dbprov1004.eqiad.wmnet with reason: network maintenance
* 08:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 08:31 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 08:21 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp - 3.0 upgrade ()
* 08:21 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_codfw and A:cp - 3.0 upgrade ()
* 08:11 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp5032.*
* 07:54 topranks: disabling IBGP session between ssw1-d1-eqiad and ssw1-d8-eqiad to remove backup paths [[phab:T411054|T411054]]
* 07:43 moritzm: installing libbpf updates from Bookworm point release
* 05:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 05:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 04s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 01:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89793 and previous config saved to /var/cache/conftool/dbconfig/20260304-015657-marostegui.json
* 01:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P89792 and previous config saved to /var/cache/conftool/dbconfig/20260304-014150-marostegui.json
* 01:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P89791 and previous config saved to /var/cache/conftool/dbconfig/20260304-012642-marostegui.json
* 01:23 zabe@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 01:22 zabe@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 01:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89790 and previous config saved to /var/cache/conftool/dbconfig/20260304-011134-marostegui.json
* 00:46 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1263 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89789 and previous config saved to /var/cache/conftool/dbconfig/20260304-004638-marostegui.json
* 00:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1263.eqiad.wmnet with reason: Maintenance
* 00:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89788 and previous config saved to /var/cache/conftool/dbconfig/20260304-004615-marostegui.json
* 00:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P89787 and previous config saved to /var/cache/conftool/dbconfig/20260304-003107-marostegui.json
* 00:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P89786 and previous config saved to /var/cache/conftool/dbconfig/20260304-001559-marostegui.json
* 00:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89785 and previous config saved to /var/cache/conftool/dbconfig/20260304-000052-marostegui.json
== 2026-03-03 ==
* 23:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1262 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89784 and previous config saved to /var/cache/conftool/dbconfig/20260303-233500-marostegui.json
* 23:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1262.eqiad.wmnet with reason: Maintenance
* 23:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89783 and previous config saved to /var/cache/conftool/dbconfig/20260303-233436-marostegui.json
* 23:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P89782 and previous config saved to /var/cache/conftool/dbconfig/20260303-231929-marostegui.json
* 23:10 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 23:08 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 23:08 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:07 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:05 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 23:05 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 23:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P89781 and previous config saved to /var/cache/conftool/dbconfig/20260303-230421-marostegui.json
* 23:04 bking@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1028.eqiad.wmnet
* 23:02 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247689{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247690{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247596{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #2) (T415007)]] (duration: 21m 47s)
* 23:00 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7008.magru.wmnet [reason: lldpd packet drop issues]
* 22:58 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7008 [reason: lldpd packet drop issues]
* 22:58 tgr@deploy2002: tgr: Continuing with sync
* 22:56 bking@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1028.eqiad.wmnet
* 22:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89780 and previous config saved to /var/cache/conftool/dbconfig/20260303-224913-marostegui.json
* 22:45 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 22:45 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 22:44 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 22:44 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 22:42 tgr@deploy2002: tgr: Backport for [[gerrit:1247689{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247690{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247596{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #2) (T415007)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:40 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1247689{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247690{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247596{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #2) (T415007)]]
* 22:26 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 22:26 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 22:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1261 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89779 and previous config saved to /var/cache/conftool/dbconfig/20260303-222324-marostegui.json
* 22:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1261.eqiad.wmnet with reason: Maintenance
* 22:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89778 and previous config saved to /var/cache/conftool/dbconfig/20260303-222301-marostegui.json
* 22:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P89777 and previous config saved to /var/cache/conftool/dbconfig/20260303-220754-marostegui.json
* 21:59 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1245162 [[phab:T411807|T411807]] (duration: 12m 15s)
* 21:58 rzl@deploy2002: rzl: Continuing with sync
* 21:56 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 21:56 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 21:55 rzl@deploy2002: rzl: https://gerrit.wikimedia.org/r/1245162 [[phab:T411807|T411807]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1245162 [[phab:T411807|T411807]]
* 21:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P89776 and previous config saved to /var/cache/conftool/dbconfig/20260303-215247-marostegui.json
* 21:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89775 and previous config saved to /var/cache/conftool/dbconfig/20260303-214931-marostegui.json
* 21:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp2045.codfw.wmnet
* 21:48 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp2045.codfw.wmnet
* 21:40 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 21:39 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 21:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89774 and previous config saved to /var/cache/conftool/dbconfig/20260303-213739-marostegui.json
* 21:35 jhuneidi@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244748{{!}}REST: show the beta Attribution API in the REST Sandbox (T418522)]] (duration: 07m 41s)
* 21:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P89773 and previous config saved to /var/cache/conftool/dbconfig/20260303-213423-marostegui.json
* 21:32 jhuneidi@deploy2002: jhuneidi, bpirkle: Continuing with sync
* 21:30 jhuneidi@deploy2002: jhuneidi, bpirkle: Backport for [[gerrit:1244748{{!}}REST: show the beta Attribution API in the REST Sandbox (T418522)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:28 jhuneidi@deploy2002: Started scap sync-world: Backport for [[gerrit:1244748{{!}}REST: show the beta Attribution API in the REST Sandbox (T418522)]]
* 21:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P89772 and previous config saved to /var/cache/conftool/dbconfig/20260303-211915-marostegui.json
* 21:18 jhuneidi@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247652{{!}}Remove redundant mw-extra wgRestSandboxSpecs entry]] (duration: 06m 56s)
* 21:14 jhuneidi@deploy2002: jhuneidi, aaron: Continuing with sync
* 21:13 jhuneidi@deploy2002: jhuneidi, aaron: Backport for [[gerrit:1247652{{!}}Remove redundant mw-extra wgRestSandboxSpecs entry]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:11 jhuneidi@deploy2002: Started scap sync-world: Backport for [[gerrit:1247652{{!}}Remove redundant mw-extra wgRestSandboxSpecs entry]]
* 21:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1260 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89771 and previous config saved to /var/cache/conftool/dbconfig/20260303-211033-marostegui.json
* 21:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1260.eqiad.wmnet with reason: Maintenance
* 21:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89770 and previous config saved to /var/cache/conftool/dbconfig/20260303-211009-marostegui.json
* 21:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89769 and previous config saved to /var/cache/conftool/dbconfig/20260303-210407-marostegui.json
* 20:58 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp2045.codfw.wmnet with reason: troubleshooting for [[phab:T418527|T418527]]
* 20:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P89768 and previous config saved to /var/cache/conftool/dbconfig/20260303-205502-marostegui.json
* 20:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7008.magru.wmnet with OS trixie
* 20:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89767 and previous config saved to /var/cache/conftool/dbconfig/20260303-204452-marostegui.json
* 20:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2248.codfw.wmnet with reason: Maintenance
* 20:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89766 and previous config saved to /var/cache/conftool/dbconfig/20260303-204439-marostegui.json
* 20:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P89765 and previous config saved to /var/cache/conftool/dbconfig/20260303-203954-marostegui.json
* 20:34 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:34 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P89764 and previous config saved to /var/cache/conftool/dbconfig/20260303-202931-marostegui.json
* 20:24 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7008.magru.wmnet with reason: host reimage
* 20:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89763 and previous config saved to /var/cache/conftool/dbconfig/20260303-202447-marostegui.json
* 20:17 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7008.magru.wmnet with reason: host reimage
* 20:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P89762 and previous config saved to /var/cache/conftool/dbconfig/20260303-201423-marostegui.json
* 20:10 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1199.eqiad.wmnet
* 19:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89761 and previous config saved to /var/cache/conftool/dbconfig/20260303-195916-marostegui.json
* 19:59 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1252 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89760 and previous config saved to /var/cache/conftool/dbconfig/20260303-195900-marostegui.json
* 19:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1252.eqiad.wmnet with reason: Maintenance
* 19:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89759 and previous config saved to /var/cache/conftool/dbconfig/20260303-195835-marostegui.json
* 19:51 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7008.magru.wmnet with OS trixie
* 19:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P89758 and previous config saved to /var/cache/conftool/dbconfig/20260303-194327-marostegui.json
* 19:42 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp2043.codfw.wmnet
* 19:42 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp2043.codfw.wmnet
* 19:33 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89757 and previous config saved to /var/cache/conftool/dbconfig/20260303-193351-marostegui.json
* 19:33 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 19:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89756 and previous config saved to /var/cache/conftool/dbconfig/20260303-193338-marostegui.json
* 19:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P89755 and previous config saved to /var/cache/conftool/dbconfig/20260303-192820-marostegui.json
* 19:19 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.18 refs [[phab:T413809|T413809]]
* 19:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P89754 and previous config saved to /var/cache/conftool/dbconfig/20260303-191830-marostegui.json
* 19:17 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2047.codfw.wmnet with OS trixie
* 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89753 and previous config saved to /var/cache/conftool/dbconfig/20260303-191312-marostegui.json
* 19:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P89752 and previous config saved to /var/cache/conftool/dbconfig/20260303-190323-marostegui.json
* 18:53 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
* 18:49 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1199.eqiad.wmnet
* 18:49 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1198.eqiad.wmnet
* 18:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1249 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89751 and previous config saved to /var/cache/conftool/dbconfig/20260303-184937-marostegui.json
* 18:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1249.eqiad.wmnet with reason: Maintenance
* 18:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89750 and previous config saved to /var/cache/conftool/dbconfig/20260303-184913-marostegui.json
* 18:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89749 and previous config saved to /var/cache/conftool/dbconfig/20260303-184815-marostegui.json
* 18:47 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
* 18:45 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1096.eqiad.wmnet with OS bullseye
* 18:36 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1198.eqiad.wmnet
* 18:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1197.eqiad.wmnet
* 18:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P89747 and previous config saved to /var/cache/conftool/dbconfig/20260303-183406-marostegui.json
* 18:29 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp2047.codfw.wmnet with OS trixie
* 18:24 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1197.eqiad.wmnet
* 18:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1196.eqiad.wmnet
* 18:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89746 and previous config saved to /var/cache/conftool/dbconfig/20260303-182346-marostegui.json
* 18:23 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1096.eqiad.wmnet with reason: host reimage
* 18:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 18:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89745 and previous config saved to /var/cache/conftool/dbconfig/20260303-182321-marostegui.json
* 18:19 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1096.eqiad.wmnet with reason: host reimage
* 18:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P89744 and previous config saved to /var/cache/conftool/dbconfig/20260303-181859-marostegui.json
* 18:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1196.eqiad.wmnet
* 18:11 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1195.eqiad.wmnet
* 18:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P89743 and previous config saved to /var/cache/conftool/dbconfig/20260303-180814-marostegui.json
* 18:04 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247635{{!}}Style fixes for copy-paste feature (T414072)]] (duration: 32m 54s)
* 18:04 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89742 and previous config saved to /var/cache/conftool/dbconfig/20260303-180352-marostegui.json
* 18:02 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1096.eqiad.wmnet with OS bullseye
* 18:02 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:59 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1195.eqiad.wmnet
* 17:59 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host an-worker1194.eqiad.wmnet
* 17:55 ariel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:53 ariel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P89741 and previous config saved to /var/cache/conftool/dbconfig/20260303-175304-marostegui.json
* 17:52 jforrester@deploy2002: jforrester: Continuing with sync
* 17:51 jforrester@deploy2002: jforrester: Backport for [[gerrit:1247635{{!}}Style fixes for copy-paste feature (T414072)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:47 ariel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:46 ariel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1194.eqiad.wmnet
* 17:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1193.eqiad.wmnet
* 17:39 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89740 and previous config saved to /var/cache/conftool/dbconfig/20260303-173914-marostegui.json
* 17:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1248.eqiad.wmnet with reason: Maintenance
* 17:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89739 and previous config saved to /var/cache/conftool/dbconfig/20260303-173850-marostegui.json
* 17:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89738 and previous config saved to /var/cache/conftool/dbconfig/20260303-173756-marostegui.json
* 17:31 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1247635{{!}}Style fixes for copy-paste feature (T414072)]]
* 17:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1193.eqiad.wmnet
* 17:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1192.eqiad.wmnet
* 17:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P89736 and previous config saved to /var/cache/conftool/dbconfig/20260303-172343-marostegui.json
* 17:18 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1192.eqiad.wmnet
* 17:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1191.eqiad.wmnet
* 17:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89735 and previous config saved to /var/cache/conftool/dbconfig/20260303-171149-marostegui.json
* 17:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 17:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89734 and previous config saved to /var/cache/conftool/dbconfig/20260303-171126-marostegui.json
* 17:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P89733 and previous config saved to /var/cache/conftool/dbconfig/20260303-170835-marostegui.json
* 17:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1191.eqiad.wmnet
* 17:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1190.eqiad.wmnet
* 16:56 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1190.eqiad.wmnet
* 16:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P89732 and previous config saved to /var/cache/conftool/dbconfig/20260303-165618-marostegui.json
* 16:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89731 and previous config saved to /var/cache/conftool/dbconfig/20260303-165327-marostegui.json
* 16:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1189.eqiad.wmnet
* 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P89730 and previous config saved to /var/cache/conftool/dbconfig/20260303-164111-marostegui.json
* 16:34 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main'.
* 16:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1189.eqiad.wmnet
* 16:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1188.eqiad.wmnet
* 16:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89729 and previous config saved to /var/cache/conftool/dbconfig/20260303-162845-marostegui.json
* 16:28 fceratto@cumin1003: dbctl commit (dc=all): 'Setting x1 codfw weights to 300 [[phab:T416705|T416705]]', diff saved to https://phabricator.wikimedia.org/P89728 and previous config saved to /var/cache/conftool/dbconfig/20260303-162836-fceratto.json
* 16:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1247.eqiad.wmnet with reason: Maintenance
* 16:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89727 and previous config saved to /var/cache/conftool/dbconfig/20260303-162603-marostegui.json
* 16:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 16:18 fceratto@cumin1003: dbctl commit (dc=all): 'Setting db1188 weight to 100 [[phab:T416705|T416705]]', diff saved to https://phabricator.wikimedia.org/P89726 and previous config saved to /var/cache/conftool/dbconfig/20260303-161846-fceratto.json
* 16:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 16:17 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1188.eqiad.wmnet
* 16:17 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1187.eqiad.wmnet
* 16:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1166: testing:crash
* 16:14 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1166: testing:crash
* 16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Setting db1169 weight to 300 [[phab:T416705|T416705]]', diff saved to https://phabricator.wikimedia.org/P89724 and previous config saved to /var/cache/conftool/dbconfig/20260303-161323-fceratto.json
* 16:12 fceratto@cumin1003: dbctl commit (dc=all): 'Setting db1188 weight to 300 [[phab:T416705|T416705]]', diff saved to https://phabricator.wikimedia.org/P89723 and previous config saved to /var/cache/conftool/dbconfig/20260303-161230-fceratto.json
* 16:07 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 16:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89722 and previous config saved to /var/cache/conftool/dbconfig/20260303-160720-marostegui.json
* 16:07 brennen@deploy2002: Finished deploy [phabricator/deployment@a883b6d]: deploy phab1004 for [[phab:T418872|T418872]] (duration: 01m 07s)
* 16:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1187.eqiad.wmnet
* 16:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1186.eqiad.wmnet
* 16:05 brennen@deploy2002: Started deploy [phabricator/deployment@a883b6d]: deploy phab1004 for [[phab:T418872|T418872]]
* 16:05 brennen@deploy2002: Finished deploy [phabricator/deployment@a883b6d]: deploy phab2002 for [[phab:T418872|T418872]] (duration: 00m 32s)
* 16:04 brennen@deploy2002: Started deploy [phabricator/deployment@a883b6d]: deploy phab2002 for [[phab:T418872|T418872]]
* 16:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2240 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89721 and previous config saved to /var/cache/conftool/dbconfig/20260303-160207-marostegui.json
* 16:02 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
* 16:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2240.codfw.wmnet with reason: Maintenance
* 16:01 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
* 16:00 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247569{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]] (duration: 09m 28s)
* 15:54 zabe@deploy2002: zabe: Continuing with sync
* 15:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1186.eqiad.wmnet
* 15:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1185.eqiad.wmnet
* 15:54 zabe@deploy2002: zabe: Backport for [[gerrit:1247569{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:53 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade ()
* 15:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P89720 and previous config saved to /var/cache/conftool/dbconfig/20260303-155212-marostegui.json
* 15:50 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1247569{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]]
* 15:49 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade ()
* 15:45 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:42 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1185.eqiad.wmnet
* 15:42 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1184.eqiad.wmnet
* 15:42 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:41 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 15:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 15:41 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 15:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89719 and previous config saved to /var/cache/conftool/dbconfig/20260303-154104-marostegui.json
* 15:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P89718 and previous config saved to /var/cache/conftool/dbconfig/20260303-153704-marostegui.json
* 15:36 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1352-1359].eqiad.wmnet
* 15:36 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1352-1359].eqiad.wmnet
* 15:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1184.eqiad.wmnet
* 15:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1183.eqiad.wmnet
* 15:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P89717 and previous config saved to /var/cache/conftool/dbconfig/20260303-152557-marostegui.json
* 15:23 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1178.eqiad.wmnet
* 15:22 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on P<nowiki>{</nowiki>cp5032.*<nowiki>}</nowiki> and A:cp - 3.0 upgrade ()
* 15:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89716 and previous config saved to /var/cache/conftool/dbconfig/20260303-152157-marostegui.json
* 15:19 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1183.eqiad.wmnet
* 15:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1182.eqiad.wmnet
* 15:16 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P<nowiki>{</nowiki>cp5032.*<nowiki>}</nowiki> and A:cp - 3.0 upgrade ()
* 15:15 fabfur@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=1) rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
* 15:14 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade ()
* 15:14 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
* 15:13 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade ()
* 15:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 15:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P89715 and previous config saved to /var/cache/conftool/dbconfig/20260303-151049-marostegui.json
* 15:08 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:07 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1182.eqiad.wmnet
* 15:07 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1181.eqiad.wmnet
* 14:57 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1244 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89714 and previous config saved to /var/cache/conftool/dbconfig/20260303-145727-marostegui.json
* 14:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1244.eqiad.wmnet with reason: Maintenance
* 14:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89713 and previous config saved to /var/cache/conftool/dbconfig/20260303-145704-marostegui.json
* 14:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89712 and previous config saved to /var/cache/conftool/dbconfig/20260303-145541-marostegui.json
* 14:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1181.eqiad.wmnet
* 14:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1180.eqiad.wmnet
* 14:49 moritzm: installing php7.4 security updates
* 14:46 jayme@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[1352-1359].eqiad.wmnet
* 14:46 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1352-1359].eqiad.wmnet
* 14:43 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1180.eqiad.wmnet
* 14:43 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1179.eqiad.wmnet
* 14:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P89711 and previous config saved to /var/cache/conftool/dbconfig/20260303-144156-marostegui.json
* 14:38 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 14:38 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1240716{{!}}Remove Editing-related config for special wikis (T400063)]] (duration: 06m 34s)
* 14:36 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:34 esanders@deploy2002: esanders: Continuing with sync
* 14:34 esanders@deploy2002: esanders: Backport for [[gerrit:1240716{{!}}Remove Editing-related config for special wikis (T400063)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:34 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:32 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1240716{{!}}Remove Editing-related config for special wikis (T400063)]]
* 14:31 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1179.eqiad.wmnet
* 14:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1178.eqiad.wmnet
* 14:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89710 and previous config saved to /var/cache/conftool/dbconfig/20260303-143141-marostegui.json
* 14:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 14:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89709 and previous config saved to /var/cache/conftool/dbconfig/20260303-143117-marostegui.json
* 14:29 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247578{{!}}PasteCheck: Enable by default (T405127)]] (duration: 08m 01s)
* 14:27 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
* 14:27 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet
* 14:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P89708 and previous config saved to /var/cache/conftool/dbconfig/20260303-142649-marostegui.json
* 14:26 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
* 14:25 esanders@deploy2002: esanders: Continuing with sync
* 14:23 esanders@deploy2002: esanders: Backport for [[gerrit:1247578{{!}}PasteCheck: Enable by default (T405127)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:21 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1247578{{!}}PasteCheck: Enable by default (T405127)]]
* 14:20 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet
* 14:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P89707 and previous config saved to /var/cache/conftool/dbconfig/20260303-141610-marostegui.json
* 14:15 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247576{{!}}Enable Wikibase GraphQL on test.wikidata.org (T417619)]], [[gerrit:1247577{{!}}Enable Wikibase GraphQL on production wikidata.org (T417619)]] (duration: 08m 17s)
* 14:11 esanders@deploy2002: esanders, jakob: Continuing with sync
* 14:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89706 and previous config saved to /var/cache/conftool/dbconfig/20260303-141142-marostegui.json
* 14:09 esanders@deploy2002: esanders, jakob: Backport for [[gerrit:1247576{{!}}Enable Wikibase GraphQL on test.wikidata.org (T417619)]], [[gerrit:1247577{{!}}Enable Wikibase GraphQL on production wikidata.org (T417619)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1247576{{!}}Enable Wikibase GraphQL on test.wikidata.org (T417619)]], [[gerrit:1247577{{!}}Enable Wikibase GraphQL on production wikidata.org (T417619)]]
* 14:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P89704 and previous config saved to /var/cache/conftool/dbconfig/20260303-140102-marostegui.json
* 13:47 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1243 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89703 and previous config saved to /var/cache/conftool/dbconfig/20260303-134702-marostegui.json
* 13:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1243.eqiad.wmnet with reason: Maintenance
* 13:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89702 and previous config saved to /var/cache/conftool/dbconfig/20260303-134639-marostegui.json
* 13:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89701 and previous config saved to /var/cache/conftool/dbconfig/20260303-134554-marostegui.json
* 13:31 moritzm: installing NSS security updates
* 13:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P89700 and previous config saved to /var/cache/conftool/dbconfig/20260303-133131-marostegui.json
* 13:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89699 and previous config saved to /var/cache/conftool/dbconfig/20260303-132414-marostegui.json
* 13:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 13:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89698 and previous config saved to /var/cache/conftool/dbconfig/20260303-132350-marostegui.json
* 13:20 tappof: Thanos: re-enable querier<->ruler cross-site traffic [[phab:T412924|T412924]]
* 13:17 dpogorzelski@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=recommendation-api,name=eqiad
* 13:17 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/ml-serve-eqiad: maintenance
* 13:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P89697 and previous config saved to /var/cache/conftool/dbconfig/20260303-131624-marostegui.json
* 13:16 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/ml-serve-eqiad: maintenance
* 13:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 13:11 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1177.eqiad.wmnet
* 13:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1359.eqiad.wmnet with OS trixie
* 13:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P89696 and previous config saved to /var/cache/conftool/dbconfig/20260303-130842-marostegui.json
* 13:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89695 and previous config saved to /var/cache/conftool/dbconfig/20260303-130117-marostegui.json
* 13:01 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:00 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1177.eqiad.wmnet
* 13:00 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 13:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1176.eqiad.wmnet
* 12:59 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1358.eqiad.wmnet with OS trixie
* 12:56 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:55 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:53 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1359.eqiad.wmnet with reason: host reimage
* 12:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P89694 and previous config saved to /var/cache/conftool/dbconfig/20260303-125335-marostegui.json
* 12:52 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1357.eqiad.wmnet with OS trixie
* 12:51 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:50 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:48 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1359.eqiad.wmnet with reason: host reimage
* 12:48 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1356.eqiad.wmnet with OS trixie
* 12:47 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:47 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 12:47 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:47 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 12:46 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 12:46 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 12:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1176.eqiad.wmnet
* 12:46 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1175.eqiad.wmnet
* 12:46 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 12:45 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:45 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 12:45 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 12:45 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:45 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'readability' for release 'main' .
* 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'ores-legacy' for release 'main' .
* 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 12:43 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1358.eqiad.wmnet with reason: host reimage
* 12:42 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 12:42 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 12:41 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:40 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247559{{!}}Enable thumb steps on private wikis too (T414805)]] (duration: 13m 01s)
* 12:39 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89693 and previous config saved to /var/cache/conftool/dbconfig/20260303-123827-marostegui.json
* 12:36 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1357.eqiad.wmnet with reason: host reimage
* 12:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1242 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89692 and previous config saved to /var/cache/conftool/dbconfig/20260303-123642-marostegui.json
* 12:36 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1359.eqiad.wmnet with OS trixie
* 12:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1242.eqiad.wmnet with reason: Maintenance
* 12:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89691 and previous config saved to /var/cache/conftool/dbconfig/20260303-123619-marostegui.json
* 12:34 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1175.eqiad.wmnet
* 12:34 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1174.eqiad.wmnet
* 12:34 dpogorzelski@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=recommendation-api,name=eqiad
* 12:33 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 12:33 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1356.eqiad.wmnet with reason: host reimage
* 12:31 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:31 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in eqiad/ml-serve-eqiad: maintenance
* 12:31 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:31 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1247559{{!}}Enable thumb steps on private wikis too (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:30 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in eqiad/ml-serve-eqiad: maintenance
* 12:28 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1358.eqiad.wmnet with reason: host reimage
* 12:28 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1357.eqiad.wmnet with reason: host reimage
* 12:28 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1356.eqiad.wmnet with reason: host reimage
* 12:27 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:27 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1247559{{!}}Enable thumb steps on private wikis too (T414805)]]
* 12:26 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1174.eqiad.wmnet
* 12:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1173.eqiad.wmnet
* 12:21 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P89690 and previous config saved to /var/cache/conftool/dbconfig/20260303-122112-marostegui.json
* 12:20 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:20 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:19 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1353.eqiad.wmnet with OS trixie
* 12:16 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1358.eqiad.wmnet with OS trixie
* 12:16 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1357.eqiad.wmnet with OS trixie
* 12:15 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1356.eqiad.wmnet with OS trixie
* 12:14 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1355.eqiad.wmnet with OS trixie
* 12:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89689 and previous config saved to /var/cache/conftool/dbconfig/20260303-121420-marostegui.json
* 12:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2219.codfw.wmnet with reason: Maintenance
* 12:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89688 and previous config saved to /var/cache/conftool/dbconfig/20260303-121355-marostegui.json
* 12:09 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1354.eqiad.wmnet with OS trixie
* 12:08 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1173.eqiad.wmnet
* 12:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1172.eqiad.wmnet
* 12:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P89687 and previous config saved to /var/cache/conftool/dbconfig/20260303-120604-marostegui.json
* 12:04 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1352.eqiad.wmnet with OS trixie
* 12:02 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1353.eqiad.wmnet with reason: host reimage
* 11:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P89686 and previous config saved to /var/cache/conftool/dbconfig/20260303-115847-marostegui.json
* 11:58 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1355.eqiad.wmnet with reason: host reimage
* 11:52 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1354.eqiad.wmnet with reason: host reimage
* 11:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89685 and previous config saved to /var/cache/conftool/dbconfig/20260303-115057-marostegui.json
* 11:48 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1352.eqiad.wmnet with reason: host reimage
* 11:44 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1355.eqiad.wmnet with reason: host reimage
* 11:43 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1354.eqiad.wmnet with reason: host reimage
* 11:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P89684 and previous config saved to /var/cache/conftool/dbconfig/20260303-114341-marostegui.json
* 11:43 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1353.eqiad.wmnet with reason: host reimage
* 11:42 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1352.eqiad.wmnet with reason: host reimage
* 11:40 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
* 11:36 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
* 11:31 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1355.eqiad.wmnet with OS trixie
* 11:31 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1354.eqiad.wmnet with OS trixie
* 11:30 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1353.eqiad.wmnet with OS trixie
* 11:30 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1352.eqiad.wmnet with OS trixie
* 11:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T418465|T418465]])', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260303-112828-marostegui.json
* 11:25 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1241 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89683 and previous config saved to /var/cache/conftool/dbconfig/20260303-112535-marostegui.json
* 11:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1241.eqiad.wmnet with reason: Maintenance
* 11:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89682 and previous config saved to /var/cache/conftool/dbconfig/20260303-112511-marostegui.json
* 11:21 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 11:18 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 11:18 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:17 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:17 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 11:16 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1350-1351].eqiad.wmnet
* 11:16 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1350-1351].eqiad.wmnet
* 11:15 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 11:15 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:15 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:15 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 11:14 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 11:14 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 11:13 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 11:13 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 11:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1172.eqiad.wmnet
* 11:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1171.eqiad.wmnet
* 11:13 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 11:13 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 11:12 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 11:11 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 11:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P89681 and previous config saved to /var/cache/conftool/dbconfig/20260303-111003-marostegui.json
* 11:09 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 11:08 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 11:08 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 11:07 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 11:07 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 11:06 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 11:05 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89680 and previous config saved to /var/cache/conftool/dbconfig/20260303-110551-marostegui.json
* 11:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2210.codfw.wmnet with reason: Maintenance
* 11:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89679 and previous config saved to /var/cache/conftool/dbconfig/20260303-110527-marostegui.json
* 10:59 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1171.eqiad.wmnet
* 10:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1170.eqiad.wmnet
* 10:57 slyngshede@dns1004: END - running authdns-update
* 10:55 slyngshede@dns1004: START - running authdns-update
* 10:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P89678 and previous config saved to /var/cache/conftool/dbconfig/20260303-105455-marostegui.json
* 10:54 hashar@deploy2002: Finished deploy [gerrit/gerrit@12177b1]: wm-checks-api: add tag for Selenium jobs (duration: 00m 13s)
* 10:54 hashar@deploy2002: Started deploy [gerrit/gerrit@12177b1]: wm-checks-api: add tag for Selenium jobs
* 10:51 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
* 10:51 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
* 10:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P89677 and previous config saved to /var/cache/conftool/dbconfig/20260303-105020-marostegui.json
* 10:47 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1170.eqiad.wmnet
* 10:45 fabfur: start upgrading haproxy to 3.0 on A:cp-eqsin ([[phab:T417253|T417253]])
* 10:43 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 10:41 moritzm: installing Django security updates
* 10:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89676 and previous config saved to /var/cache/conftool/dbconfig/20260303-103947-marostegui.json
* 10:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P89675 and previous config saved to /var/cache/conftool/dbconfig/20260303-103512-marostegui.json
* 10:34 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:33 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:33 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:31 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 10:25 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 10:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89674 and previous config saved to /var/cache/conftool/dbconfig/20260303-102004-marostegui.json
* 10:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1238 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89673 and previous config saved to /var/cache/conftool/dbconfig/20260303-101800-marostegui.json
* 10:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1238.eqiad.wmnet with reason: Maintenance
* 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89672 and previous config saved to /var/cache/conftool/dbconfig/20260303-101747-marostegui.json
* 09:57 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 09:56 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89670 and previous config saved to /var/cache/conftool/dbconfig/20260303-095655-marostegui.json
* 09:56 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2206.codfw.wmnet with reason: Maintenance
* 09:53 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:51 moritzm: installing qemu security updates
* 09:48 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 09:48 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 09:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P89669 and previous config saved to /var/cache/conftool/dbconfig/20260303-094732-marostegui.json
* 09:47 elukey@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 09:47 elukey@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 09:45 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 09:45 elukey@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 09:44 elukey@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 09:44 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'ores-legacy' for release 'main' .
* 09:44 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 09:44 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1176.eqiad.wmnet with reason: host reimage
* 09:43 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 09:40 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 09:38 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1176.eqiad.wmnet with reason: host reimage
* 09:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2199.codfw.wmnet with reason: Maintenance
* 09:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89668 and previous config saved to /var/cache/conftool/dbconfig/20260303-093542-marostegui.json
* 09:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89667 and previous config saved to /var/cache/conftool/dbconfig/20260303-093224-marostegui.json
* 09:32 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 09:23 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade ()
* 09:23 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db1176.eqiad.wmnet with OS trixie
* 09:21 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 09:20 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 09:20 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade ()
* 09:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P89666 and previous config saved to /var/cache/conftool/dbconfig/20260303-092034-marostegui.json
* 09:19 arnaudb@dns1004: END - running authdns-update
* 09:18 arnaudb@dns1004: START - running authdns-update
* 09:17 moritzm: installing libbpf updates from Bookworm point release
* 09:08 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89665 and previous config saved to /var/cache/conftool/dbconfig/20260303-090818-marostegui.json
* 09:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on 6 hosts with reason: Maintenance
* 09:07 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1221.eqiad.wmnet with reason: Maintenance
* 09:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89664 and previous config saved to /var/cache/conftool/dbconfig/20260303-090731-marostegui.json
* 09:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P89663 and previous config saved to /var/cache/conftool/dbconfig/20260303-090526-marostegui.json
* 08:54 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-serve-codfw: maintenance
* 08:53 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-serve-codfw: maintenance
* 08:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P89662 and previous config saved to /var/cache/conftool/dbconfig/20260303-085224-marostegui.json
* 08:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89661 and previous config saved to /var/cache/conftool/dbconfig/20260303-085019-marostegui.json
* 08:47 moritzm: powercycling lvs1013
* 08:41 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade ()
* 08:41 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade ()
* 08:37 fabfur: start upgrading haproxy to 3.0 on A:cp-ulsfo ([[phab:T417253|T417253]])
* 08:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P89660 and previous config saved to /var/cache/conftool/dbconfig/20260303-083716-marostegui.json
* 08:32 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 08:32 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:31 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 08:30 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 08:28 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-serve-codfw: maintenance
* 08:27 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-serve-codfw: maintenance
* 08:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89659 and previous config saved to /var/cache/conftool/dbconfig/20260303-082424-marostegui.json
* 08:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 08:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89658 and previous config saved to /var/cache/conftool/dbconfig/20260303-082400-marostegui.json
* 08:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89657 and previous config saved to /var/cache/conftool/dbconfig/20260303-082209-marostegui.json
* 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P89656 and previous config saved to /var/cache/conftool/dbconfig/20260303-080853-marostegui.json
* 08:07 moritzm: installing PAM security updates on Bookworm
* 07:55 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1199 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89655 and previous config saved to /var/cache/conftool/dbconfig/20260303-075526-marostegui.json
* 07:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1199.eqiad.wmnet with reason: Maintenance
* 07:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89654 and previous config saved to /var/cache/conftool/dbconfig/20260303-075502-marostegui.json
* 07:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P89653 and previous config saved to /var/cache/conftool/dbconfig/20260303-075345-marostegui.json
* 07:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P89652 and previous config saved to /var/cache/conftool/dbconfig/20260303-073955-marostegui.json
* 07:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89651 and previous config saved to /var/cache/conftool/dbconfig/20260303-073838-marostegui.json
* 07:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P89650 and previous config saved to /var/cache/conftool/dbconfig/20260303-072447-marostegui.json
* 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
* 07:18 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
* 07:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89649 and previous config saved to /var/cache/conftool/dbconfig/20260303-071054-marostegui.json
* 07:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 07:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89648 and previous config saved to /var/cache/conftool/dbconfig/20260303-071029-marostegui.json
* 07:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89647 and previous config saved to /var/cache/conftool/dbconfig/20260303-070940-marostegui.json
* 06:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P89646 and previous config saved to /var/cache/conftool/dbconfig/20260303-065523-marostegui.json
* 06:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1190 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89645 and previous config saved to /var/cache/conftool/dbconfig/20260303-064405-marostegui.json
* 06:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1190.eqiad.wmnet with reason: Maintenance
* 06:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P89644 and previous config saved to /var/cache/conftool/dbconfig/20260303-064015-marostegui.json
* 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2240 gradually with 4 steps - repool after schema change
* 06:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89642 and previous config saved to /var/cache/conftool/dbconfig/20260303-062507-marostegui.json
* 05:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2147 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89639 and previous config saved to /var/cache/conftool/dbconfig/20260303-055834-marostegui.json
* 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2147.codfw.wmnet with reason: Maintenance
* 05:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 05:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2212.codfw.wmnet with reason: Maintenance
* 05:48 marostegui@cumin1003: START - Cookbook sre.mysql.pool db2240 gradually with 4 steps - repool after schema change
* 05:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 05:01 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.15 (duration: 01m 10s)
* 04:43 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.18 refs [[phab:T413809|T413809]] (duration: 39m 43s)
* 04:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.18 refs [[phab:T413809|T413809]]
* 03:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 03:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89637 and previous config saved to /var/cache/conftool/dbconfig/20260303-035746-marostegui.json
* 03:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P89636 and previous config saved to /var/cache/conftool/dbconfig/20260303-034239-marostegui.json
* 03:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P89635 and previous config saved to /var/cache/conftool/dbconfig/20260303-032731-marostegui.json
* 03:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89634 and previous config saved to /var/cache/conftool/dbconfig/20260303-031224-marostegui.json
* 03:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1251 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89633 and previous config saved to /var/cache/conftool/dbconfig/20260303-030217-marostegui.json
* 03:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1251.eqiad.wmnet with reason: Maintenance
* 02:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1240.eqiad.wmnet with reason: Maintenance
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 00s)
* 02:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 02:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89632 and previous config saved to /var/cache/conftool/dbconfig/20260303-020817-marostegui.json
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P89631 and previous config saved to /var/cache/conftool/dbconfig/20260303-015309-marostegui.json
* 01:42 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mwlog2003.codfw.wmnet with OS trixie
* 01:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P89630 and previous config saved to /var/cache/conftool/dbconfig/20260303-013802-marostegui.json
* 01:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89629 and previous config saved to /var/cache/conftool/dbconfig/20260303-013719-marostegui.json
* 01:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89628 and previous config saved to /var/cache/conftool/dbconfig/20260303-012254-marostegui.json
* 01:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P89627 and previous config saved to /var/cache/conftool/dbconfig/20260303-012211-marostegui.json
* 01:19 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mwlog2003.codfw.wmnet with reason: host reimage
* 01:11 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mwlog2003.codfw.wmnet with reason: host reimage
* 01:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1235 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89626 and previous config saved to /var/cache/conftool/dbconfig/20260303-011151-marostegui.json
* 01:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1235.eqiad.wmnet with reason: Maintenance
* 01:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89625 and previous config saved to /var/cache/conftool/dbconfig/20260303-011128-marostegui.json
* 01:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P89624 and previous config saved to /var/cache/conftool/dbconfig/20260303-010703-marostegui.json
* 00:59 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247189{{!}}Revert "ImageListPager: Properly support file schema migration read new"]] (duration: 08m 12s)
* 00:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P89623 and previous config saved to /var/cache/conftool/dbconfig/20260303-005620-marostegui.json
* 00:56 zabe@deploy2002: zabe: Continuing with sync
* 00:54 herron@cumin1003: START - Cookbook sre.hosts.reimage for host mwlog2003.codfw.wmnet with OS trixie
* 00:53 zabe@deploy2002: zabe: Backport for [[gerrit:1247189{{!}}Revert "ImageListPager: Properly support file schema migration read new"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mwlog2003.codfw.wmnet with OS trixie
* 00:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89622 and previous config saved to /var/cache/conftool/dbconfig/20260303-005156-marostegui.json
* 00:51 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1247189{{!}}Revert "ImageListPager: Properly support file schema migration read new"]]
* 00:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P89621 and previous config saved to /var/cache/conftool/dbconfig/20260303-004112-marostegui.json
* 00:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2216 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89620 and previous config saved to /var/cache/conftool/dbconfig/20260303-004056-marostegui.json
* 00:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2216.codfw.wmnet with reason: Maintenance
* 00:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89619 and previous config saved to /var/cache/conftool/dbconfig/20260303-004033-marostegui.json
* 00:31 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mwlog1003.eqiad.wmnet with OS trixie
* 00:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89618 and previous config saved to /var/cache/conftool/dbconfig/20260303-002604-marostegui.json
* 00:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to https://phabricator.wikimedia.org/P89617 and previous config saved to /var/cache/conftool/dbconfig/20260303-002525-marostegui.json
* 00:20 zabe@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 00:18 zabe@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 00:18 zabe@deploy2002: Finished scap sync-world: [[phab:T418327|T418327]] (duration: 05m 01s)
* 00:15 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1234 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89616 and previous config saved to /var/cache/conftool/dbconfig/20260303-001504-marostegui.json
* 00:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1234.eqiad.wmnet with reason: Maintenance
* 00:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89615 and previous config saved to /var/cache/conftool/dbconfig/20260303-001440-marostegui.json
* 00:13 zabe@deploy2002: Started scap sync-world: [[phab:T418327|T418327]]
* 00:11 zabe@deploy2002: zabe: Continuing with sync
* 00:10 zabe@deploy2002: zabe: Backport for [[gerrit:1247068{{!}}ImageListPager: Properly support file schema migration read new (T418327)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to https://phabricator.wikimedia.org/P89614 and previous config saved to /var/cache/conftool/dbconfig/20260303-001018-marostegui.json
* 00:08 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1247068{{!}}ImageListPager: Properly support file schema migration read new (T418327)]]
== 2026-03-02 ==
* 23:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P89613 and previous config saved to /var/cache/conftool/dbconfig/20260302-235933-marostegui.json
* 23:58 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1240320{{!}}Stop writing to il_to on testwiki (T415787)]] (duration: 06m 02s)
* 23:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89612 and previous config saved to /var/cache/conftool/dbconfig/20260302-235511-marostegui.json
* 23:54 zabe@deploy2002: zabe: Continuing with sync
* 23:53 zabe@deploy2002: zabe: Backport for [[gerrit:1240320{{!}}Stop writing to il_to on testwiki (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:52 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1240320{{!}}Stop writing to il_to on testwiki (T415787)]]
* 23:51 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp2058.codfw.wmnet with reason: dcops troubleshooting for [[phab:T418527|T418527]]
* 23:50 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1246880{{!}}multiversion: Stop setting MW_USE_CONFIG_SCHEMA (T304460)]] (duration: 07m 10s)
* 23:47 zabe@deploy2002: zabe: Continuing with sync
* 23:45 zabe@deploy2002: zabe: Backport for [[gerrit:1246880{{!}}multiversion: Stop setting MW_USE_CONFIG_SCHEMA (T304460)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P89611 and previous config saved to /var/cache/conftool/dbconfig/20260302-234425-marostegui.json
* 23:44 herron@cumin1003: START - Cookbook sre.hosts.reimage for host mwlog2003.codfw.wmnet with OS trixie
* 23:43 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89610 and previous config saved to /var/cache/conftool/dbconfig/20260302-234350-marostegui.json
* 23:43 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1246880{{!}}multiversion: Stop setting MW_USE_CONFIG_SCHEMA (T304460)]]
* 23:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2203.codfw.wmnet with reason: Maintenance
* 23:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2202.codfw.wmnet with reason: Maintenance
* 23:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89609 and previous config saved to /var/cache/conftool/dbconfig/20260302-233517-marostegui.json
* 23:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89608 and previous config saved to /var/cache/conftool/dbconfig/20260302-232918-marostegui.json
* 23:25 dwisehaupt@dns1006: END - running authdns-update
* 23:24 dwisehaupt@dns1006: START - running authdns-update
* 23:23 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mwlog1003.eqiad.wmnet with reason: host reimage
* 23:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P89607 and previous config saved to /var/cache/conftool/dbconfig/20260302-232009-marostegui.json
* 23:18 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mwlog1003.eqiad.wmnet with reason: host reimage
* 23:17 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1232 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89606 and previous config saved to /var/cache/conftool/dbconfig/20260302-231723-marostegui.json
* 23:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1232.eqiad.wmnet with reason: Maintenance
* 23:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89605 and previous config saved to /var/cache/conftool/dbconfig/20260302-231658-marostegui.json
* 23:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P89604 and previous config saved to /var/cache/conftool/dbconfig/20260302-230502-marostegui.json
* 23:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P89603 and previous config saved to /var/cache/conftool/dbconfig/20260302-230151-marostegui.json
* 22:57 herron@cumin1003: START - Cookbook sre.hosts.reimage for host mwlog1003.eqiad.wmnet with OS trixie
* 22:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89602 and previous config saved to /var/cache/conftool/dbconfig/20260302-224954-marostegui.json
* 22:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P89601 and previous config saved to /var/cache/conftool/dbconfig/20260302-224643-marostegui.json
* 22:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2188 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89600 and previous config saved to /var/cache/conftool/dbconfig/20260302-223612-marostegui.json
* 22:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2188.codfw.wmnet with reason: Maintenance
* 22:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89599 and previous config saved to /var/cache/conftool/dbconfig/20260302-223548-marostegui.json
* 22:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89598 and previous config saved to /var/cache/conftool/dbconfig/20260302-223135-marostegui.json
* 22:21 maryum: Deployed security fix for [[phab:T418179|T418179]]
* 22:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P89597 and previous config saved to /var/cache/conftool/dbconfig/20260302-222041-marostegui.json
* 22:19 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89596 and previous config saved to /var/cache/conftool/dbconfig/20260302-221938-marostegui.json
* 22:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1219.eqiad.wmnet with reason: Maintenance
* 22:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89595 and previous config saved to /var/cache/conftool/dbconfig/20260302-221925-marostegui.json
* 22:10 aaron@deploy2002: Finished scap sync-world: Backport for [[gerrit:1242613{{!}}Add growthexperiments.v0 to $wgRestSandboxSpecs (T414470)]] (duration: 06m 39s)
* 22:06 aaron@deploy2002: aaron: Continuing with sync
* 22:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P89594 and previous config saved to /var/cache/conftool/dbconfig/20260302-220533-marostegui.json
* 22:05 aaron@deploy2002: aaron: Backport for [[gerrit:1242613{{!}}Add growthexperiments.v0 to $wgRestSandboxSpecs (T414470)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P89593 and previous config saved to /var/cache/conftool/dbconfig/20260302-220418-marostegui.json
* 22:03 aaron@deploy2002: Started scap sync-world: Backport for [[gerrit:1242613{{!}}Add growthexperiments.v0 to $wgRestSandboxSpecs (T414470)]]
* 22:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-backup2003.codfw.wmnet with OS trixie
* 22:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-backup2004.codfw.wmnet with OS trixie
* 22:03 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 22:03 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 22:02 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 22:01 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247149{{!}}ApiCSPReport: Use structured logging for CSP reports]] (duration: 08m 19s)
* 21:57 catrope@deploy2002: catrope: Continuing with sync
* 21:56 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 21:55 catrope@deploy2002: catrope: Backport for [[gerrit:1247149{{!}}ApiCSPReport: Use structured logging for CSP reports]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:53 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1247149{{!}}ApiCSPReport: Use structured logging for CSP reports]]
* 21:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89592 and previous config saved to /var/cache/conftool/dbconfig/20260302-215025-marostegui.json
* 21:50 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp2043.codfw.wmnet with reason: These are test instances, failing should not notif
* 21:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P89591 and previous config saved to /var/cache/conftool/dbconfig/20260302-214910-marostegui.json
* 21:48 inflatador: bking@desktop restarting wdqs codfw to clear ProbeDown alerts
* 21:43 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cp2043.codfw.wmnet
* 21:42 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-backup2004.codfw.wmnet with reason: host reimage
* 21:39 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2176 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89590 and previous config saved to /var/cache/conftool/dbconfig/20260302-213957-marostegui.json
* 21:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 21:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89589 and previous config saved to /var/cache/conftool/dbconfig/20260302-213934-marostegui.json
* 21:38 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-backup2003.codfw.wmnet with reason: host reimage
* 21:36 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:cassandra-dev: Testing removal of OpenJDK 8 support - eevans@cumin1003
* 21:34 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1226024{{!}}Add Comments namespace for shnwikinews (T414403)]] (duration: 07m 07s)
* 21:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89588 and previous config saved to /var/cache/conftool/dbconfig/20260302-213402-marostegui.json
* 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-backup2004.codfw.wmnet with reason: host reimage
* 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-backup2003.codfw.wmnet with reason: host reimage
* 21:30 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp2043.codfw.wmnet
* 21:30 catrope@deploy2002: shivaanshsingh, catrope: Continuing with sync
* 21:29 catrope@deploy2002: shivaanshsingh, catrope: Backport for [[gerrit:1226024{{!}}Add Comments namespace for shnwikinews (T414403)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:27 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1226024{{!}}Add Comments namespace for shnwikinews (T414403)]]
* 21:24 kemayo@deploy2002: Finished scap sync-world: Backport for [[gerrit:1243990{{!}}Suggestion Mode: add values for suggestion feedback properties (T401739)]], [[gerrit:1240721{{!}}Stop PasteCheck A/B test (T417429)]] (duration: 10m 55s)
* 21:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P89587 and previous config saved to /var/cache/conftool/dbconfig/20260302-212426-marostegui.json
* 21:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89586 and previous config saved to /var/cache/conftool/dbconfig/20260302-212345-marostegui.json
* 21:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1218.eqiad.wmnet with reason: Maintenance
* 21:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89585 and previous config saved to /var/cache/conftool/dbconfig/20260302-212321-marostegui.json
* 21:20 kemayo@deploy2002: esanders, kemayo, caro: Continuing with sync
* 21:17 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host ms-backup2004.codfw.wmnet with OS trixie
* 21:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host ms-backup2003.codfw.wmnet with OS trixie
* 21:16 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:cassandra-dev: Testing removal of OpenJDK 8 support - eevans@cumin1003
* 21:16 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ms-backup2003']
* 21:15 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-backup2003']
* 21:15 kemayo@deploy2002: esanders, kemayo, caro: Backport for [[gerrit:1243990{{!}}Suggestion Mode: add values for suggestion feedback properties (T401739)]], [[gerrit:1240721{{!}}Stop PasteCheck A/B test (T417429)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:14 inflatador: bking@apt1002 reprepro --component thirdparty/opensearch3 update trixie-wikimedia [[phab:T418388|T418388]]
* 21:13 kemayo@deploy2002: Started scap sync-world: Backport for [[gerrit:1243990{{!}}Suggestion Mode: add values for suggestion feedback properties (T401739)]], [[gerrit:1240721{{!}}Stop PasteCheck A/B test (T417429)]]
* 21:12 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-backup2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:11 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-backup2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:10 dani@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247107{{!}}Undeploy Comparative Reader Research survey on eswiki (T417834)]], [[gerrit:1247105{{!}}Undeploy Comparative Reader Research survey on enwiki (T417829)]] (duration: 06m 52s)
* 21:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P89584 and previous config saved to /var/cache/conftool/dbconfig/20260302-210919-marostegui.json
* 21:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P89583 and previous config saved to /var/cache/conftool/dbconfig/20260302-210813-marostegui.json
* 21:06 dani@deploy2002: dani: Continuing with sync
* 21:05 dani@deploy2002: dani: Backport for [[gerrit:1247107{{!}}Undeploy Comparative Reader Research survey on eswiki (T417834)]], [[gerrit:1247105{{!}}Undeploy Comparative Reader Research survey on enwiki (T417829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:03 dani@deploy2002: Started scap sync-world: Backport for [[gerrit:1247107{{!}}Undeploy Comparative Reader Research survey on eswiki (T417834)]], [[gerrit:1247105{{!}}Undeploy Comparative Reader Research survey on enwiki (T417829)]]
* 20:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-backup2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-backup2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-backup2004
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-backup2004
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-backup2003
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-backup2003
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-backup2003 to codfw - jhancock@cumin2002"
* 20:54 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-backup2003 to codfw - jhancock@cumin2002"
* 20:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89582 and previous config saved to /var/cache/conftool/dbconfig/20260302-205411-marostegui.json
* 20:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P89581 and previous config saved to /var/cache/conftool/dbconfig/20260302-205307-marostegui.json
* 20:50 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 20:42 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:41 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89580 and previous config saved to /var/cache/conftool/dbconfig/20260302-204136-marostegui.json
* 20:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 20:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89579 and previous config saved to /var/cache/conftool/dbconfig/20260302-204112-marostegui.json
* 20:41 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:39 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89578 and previous config saved to /var/cache/conftool/dbconfig/20260302-203759-marostegui.json
* 20:37 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:31 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89577 and previous config saved to /var/cache/conftool/dbconfig/20260302-202740-marostegui.json
* 20:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1206.eqiad.wmnet with reason: Maintenance
* 20:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89576 and previous config saved to /var/cache/conftool/dbconfig/20260302-202716-marostegui.json
* 20:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P89575 and previous config saved to /var/cache/conftool/dbconfig/20260302-202604-marostegui.json
* 20:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P89574 and previous config saved to /var/cache/conftool/dbconfig/20260302-201209-marostegui.json
* 20:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P89573 and previous config saved to /var/cache/conftool/dbconfig/20260302-201057-marostegui.json
* 20:01 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 20:00 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 19:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P89572 and previous config saved to /var/cache/conftool/dbconfig/20260302-195702-marostegui.json
* 19:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89571 and previous config saved to /var/cache/conftool/dbconfig/20260302-195549-marostegui.json
* 19:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2173 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89570 and previous config saved to /var/cache/conftool/dbconfig/20260302-194435-marostegui.json
* 19:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 19:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89569 and previous config saved to /var/cache/conftool/dbconfig/20260302-194411-marostegui.json
* 19:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89568 and previous config saved to /var/cache/conftool/dbconfig/20260302-194155-marostegui.json
* 19:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1196 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89566 and previous config saved to /var/cache/conftool/dbconfig/20260302-193119-marostegui.json
* 19:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 19:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 19:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89565 and previous config saved to /var/cache/conftool/dbconfig/20260302-193046-marostegui.json
* 19:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P89564 and previous config saved to /var/cache/conftool/dbconfig/20260302-192903-marostegui.json
* 19:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P89563 and previous config saved to /var/cache/conftool/dbconfig/20260302-191539-marostegui.json
* 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P89562 and previous config saved to /var/cache/conftool/dbconfig/20260302-191355-marostegui.json
* 19:12 dzahn@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:12 dzahn@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:04 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2095.codfw.wmnet with OS bullseye
* 19:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P89561 and previous config saved to /var/cache/conftool/dbconfig/20260302-190032-marostegui.json
* 18:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89560 and previous config saved to /var/cache/conftool/dbconfig/20260302-185848-marostegui.json
* 18:54 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 18:53 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 18:48 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89559 and previous config saved to /var/cache/conftool/dbconfig/20260302-184832-marostegui.json
* 18:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 18:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89558 and previous config saved to /var/cache/conftool/dbconfig/20260302-184808-marostegui.json
* 18:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89557 and previous config saved to /var/cache/conftool/dbconfig/20260302-184524-marostegui.json
* 18:34 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89556 and previous config saved to /var/cache/conftool/dbconfig/20260302-183449-marostegui.json
* 18:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1195.eqiad.wmnet with reason: Maintenance
* 18:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89555 and previous config saved to /var/cache/conftool/dbconfig/20260302-183425-marostegui.json
* 18:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P89554 and previous config saved to /var/cache/conftool/dbconfig/20260302-183300-marostegui.json
* 18:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P89553 and previous config saved to /var/cache/conftool/dbconfig/20260302-181918-marostegui.json
* 18:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P89552 and previous config saved to /var/cache/conftool/dbconfig/20260302-181753-marostegui.json
* 18:16 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 18:04 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P89551 and previous config saved to /var/cache/conftool/dbconfig/20260302-180411-marostegui.json
* 18:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89550 and previous config saved to /var/cache/conftool/dbconfig/20260302-180245-marostegui.json
* 18:01 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:54 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:53 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host contint2003
* 17:53 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host contint2003
* 17:52 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:52 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
* 17:52 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
* 17:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89549 and previous config saved to /var/cache/conftool/dbconfig/20260302-174917-marostegui.json
* 17:49 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 17:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 17:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89548 and previous config saved to /var/cache/conftool/dbconfig/20260302-174903-marostegui.json
* 17:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89547 and previous config saved to /var/cache/conftool/dbconfig/20260302-174854-marostegui.json
* 17:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:44 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2095.codfw.wmnet with OS bullseye
* 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host contint2003
* 17:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host contint2003
* 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
* 17:42 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
* 17:39 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 17:38 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1186 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89546 and previous config saved to /var/cache/conftool/dbconfig/20260302-173827-marostegui.json
* 17:38 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 17:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89545 and previous config saved to /var/cache/conftool/dbconfig/20260302-173803-marostegui.json
* 17:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 17:36 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 17:34 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 17:33 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 17:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P89544 and previous config saved to /var/cache/conftool/dbconfig/20260302-173347-marostegui.json
* 17:32 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.update-replication (exit_code=99)
* 17:32 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 17:24 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 17:23 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 17:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P89543 and previous config saved to /var/cache/conftool/dbconfig/20260302-172256-marostegui.json
* 17:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P89542 and previous config saved to /var/cache/conftool/dbconfig/20260302-171839-marostegui.json
* 17:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P89541 and previous config saved to /var/cache/conftool/dbconfig/20260302-170748-marostegui.json
* 17:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89540 and previous config saved to /var/cache/conftool/dbconfig/20260302-170331-marostegui.json
* 16:52 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2230.codfw.wmnet with OS trixie
* 16:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89539 and previous config saved to /var/cache/conftool/dbconfig/20260302-165240-marostegui.json
* 16:51 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2146 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89538 and previous config saved to /var/cache/conftool/dbconfig/20260302-165153-marostegui.json
* 16:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2146.codfw.wmnet with reason: Maintenance
* 16:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89537 and previous config saved to /var/cache/conftool/dbconfig/20260302-165129-marostegui.json
* 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1184 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89536 and previous config saved to /var/cache/conftool/dbconfig/20260302-164141-marostegui.json
* 16:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89535 and previous config saved to /var/cache/conftool/dbconfig/20260302-164118-marostegui.json
* 16:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P89534 and previous config saved to /var/cache/conftool/dbconfig/20260302-163622-marostegui.json
* 16:29 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2230.codfw.wmnet with reason: host reimage
* 16:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P89533 and previous config saved to /var/cache/conftool/dbconfig/20260302-162610-marostegui.json
* 16:21 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2230.codfw.wmnet with reason: host reimage
* 16:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P89532 and previous config saved to /var/cache/conftool/dbconfig/20260302-162115-marostegui.json
* 16:19 moritzm: installing PAM security updates on Bookworm
* 16:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P89531 and previous config saved to /var/cache/conftool/dbconfig/20260302-161102-marostegui.json
* 16:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89530 and previous config saved to /var/cache/conftool/dbconfig/20260302-160607-marostegui.json
* 16:05 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db2230.codfw.wmnet with OS trixie
* 15:56 moritzm: installing glibc bugfix updates from Trixie point release
* 15:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89529 and previous config saved to /var/cache/conftool/dbconfig/20260302-155555-marostegui.json
* 15:55 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2145 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89528 and previous config saved to /var/cache/conftool/dbconfig/20260302-155527-marostegui.json
* 15:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2145.codfw.wmnet with reason: Maintenance
* 15:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 15:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1169.eqiad.wmnet
* 15:45 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89527 and previous config saved to /var/cache/conftool/dbconfig/20260302-154520-marostegui.json
* 15:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 15:38 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main'.
* 15:34 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1169.eqiad.wmnet
* 15:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1167.eqiad.wmnet
* 15:32 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 15:32 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 15:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1209.eqiad.wmnet with reason: Maintenance
* 15:31 marostegui@cumin1003: dbctl commit (dc=all): 'Restore db1226 full weight after schema change', diff saved to https://phabricator.wikimedia.org/P89526 and previous config saved to /var/cache/conftool/dbconfig/20260302-153100-marostegui.json
* 15:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P89525 and previous config saved to /var/cache/conftool/dbconfig/20260302-152334-marostegui.json
* 15:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1167.eqiad.wmnet
* 15:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1166.eqiad.wmnet
* 15:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2198.codfw.wmnet with reason: Maintenance
* 15:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89524 and previous config saved to /var/cache/conftool/dbconfig/20260302-151838-marostegui.json
* 15:10 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1166.eqiad.wmnet
* 15:10 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1165.eqiad.wmnet
* 15:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P89523 and previous config saved to /var/cache/conftool/dbconfig/20260302-150826-marostegui.json
* 15:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P89522 and previous config saved to /var/cache/conftool/dbconfig/20260302-150330-marostegui.json
* 15:00 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1097.eqiad.wmnet with OS bullseye
* 15:00 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:58 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1165.eqiad.wmnet
* 14:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1164.eqiad.wmnet
* 14:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89520 and previous config saved to /var/cache/conftool/dbconfig/20260302-145318-marostegui.json
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1164.eqiad.wmnet
* 14:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1163.eqiad.wmnet
* 14:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P89519 and previous config saved to /var/cache/conftool/dbconfig/20260302-144823-marostegui.json
* 14:41 Lucas_WMDE: UTC afternoon backport+config window done
* 14:40 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247057{{!}}IPInfo: Set log level to "info" (T374718)]] (duration: 08m 01s)
* 14:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1163.eqiad.wmnet
* 14:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1162.eqiad.wmnet
* 14:36 lucaswerkmeister-wmde@deploy2002: kharlan, lucaswerkmeister-wmde: Continuing with sync
* 14:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1226 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89517 and previous config saved to /var/cache/conftool/dbconfig/20260302-143608-marostegui.json
* 14:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1226.eqiad.wmnet with reason: Maintenance
* 14:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89516 and previous config saved to /var/cache/conftool/dbconfig/20260302-143544-marostegui.json
* 14:34 lucaswerkmeister-wmde@deploy2002: kharlan, lucaswerkmeister-wmde: Backport for [[gerrit:1247057{{!}}IPInfo: Set log level to "info" (T374718)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89515 and previous config saved to /var/cache/conftool/dbconfig/20260302-143315-marostegui.json
* 14:32 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for [[gerrit:1247057{{!}}IPInfo: Set log level to "info" (T374718)]]
* 14:31 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
* 14:30 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for [[gerrit:1245364{{!}}Add configurations for graphql usage survey and its pipeline tests (T414476)]] (duration: 09m 44s)
* 14:27 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:26 lucaswerkmeister-wmde@deploy2002: itamar, lucaswerkmeister-wmde: Continuing with sync
* 14:26 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 14:25 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1162.eqiad.wmnet
* 14:25 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1161.eqiad.wmnet
* 14:23 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
* 14:22 lucaswerkmeister-wmde@deploy2002: itamar, lucaswerkmeister-wmde: Backport for [[gerrit:1245364{{!}}Add configurations for graphql usage survey and its pipeline tests (T414476)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:20 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for [[gerrit:1245364{{!}}Add configurations for graphql usage survey and its pipeline tests (T414476)]]
* 14:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P89514 and previous config saved to /var/cache/conftool/dbconfig/20260302-142037-marostegui.json
* 14:19 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:18 lucaswerkmeister-wmde@deploy2002: mwscript-k8s job started: namespaceDupes lawiki --fix # [[phab:T418706|T418706]]
* 14:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89513 and previous config saved to /var/cache/conftool/dbconfig/20260302-141834-marostegui.json
* 14:18 elukey@puppetserver1001: conftool action : set/pooled=no; selector: name=ms-fe1013.eqiad.wmnet
* 14:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2195.codfw.wmnet with reason: Maintenance
* 14:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2004.codfw.wmnet
* 14:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89512 and previous config saved to /var/cache/conftool/dbconfig/20260302-141810-marostegui.json
* 14:17 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247063{{!}}lawiki: add Adumbratio (draft) namespace (T418706)]] (duration: 07m 27s)
* 14:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people2004.codfw.wmnet
* 14:13 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, anzx: Continuing with sync
* 14:13 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 14:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1161.eqiad.wmnet
* 14:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1160.eqiad.wmnet
* 14:13 moritzm: installing libcap2 updates from Trixie point release
* 14:12 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, anzx: Backport for [[gerrit:1247063{{!}}lawiki: add Adumbratio (draft) namespace (T418706)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:11 elukey@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:11 elukey@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 14:10 elukey@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:10 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for [[gerrit:1247063{{!}}lawiki: add Adumbratio (draft) namespace (T418706)]]
* 14:10 elukey@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 14:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1028.eqiad.wmnet
* 14:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 14:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P89511 and previous config saved to /var/cache/conftool/dbconfig/20260302-140529-marostegui.json
* 14:04 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1097.eqiad.wmnet with reason: host reimage
* 14:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P89510 and previous config saved to /var/cache/conftool/dbconfig/20260302-140302-marostegui.json
* 14:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1028.eqiad.wmnet
* 14:01 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1160.eqiad.wmnet
* 14:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 14:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1159.eqiad.wmnet
* 14:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1025.eqiad.wmnet
* 13:57 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1097.eqiad.wmnet with reason: host reimage
* 13:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1025.eqiad.wmnet
* 13:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89509 and previous config saved to /var/cache/conftool/dbconfig/20260302-135021-marostegui.json
* 13:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P89508 and previous config saved to /var/cache/conftool/dbconfig/20260302-134754-marostegui.json
* 13:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1159.eqiad.wmnet
* 13:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1158.eqiad.wmnet
* 13:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1097.eqiad.wmnet with OS bullseye
* 13:38 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1097
* 13:38 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1097
* 13:38 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:38 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt ms-be1097 - jclark@cumin1003"
* 13:37 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt ms-be1097 - jclark@cumin1003"
* 13:35 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1158.eqiad.wmnet
* 13:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1157.eqiad.wmnet
* 13:35 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
* 13:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1214 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89507 and previous config saved to /var/cache/conftool/dbconfig/20260302-133503-marostegui.json
* 13:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1214.eqiad.wmnet with reason: Maintenance
* 13:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89506 and previous config saved to /var/cache/conftool/dbconfig/20260302-133440-marostegui.json
* 13:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89505 and previous config saved to /var/cache/conftool/dbconfig/20260302-133247-marostegui.json
* 13:28 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 13:27 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 13:27 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1097
* 13:26 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1097
* 13:24 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1157.eqiad.wmnet
* 13:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1156.eqiad.wmnet
* 13:22 brouberol: Running `echo 'https://turnilo-next.wikimedia.org' {{!}} mwscript-k8s --attach -- purgeList.php`
* 13:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P89504 and previous config saved to /var/cache/conftool/dbconfig/20260302-131932-marostegui.json
* 13:16 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2181 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89503 and previous config saved to /var/cache/conftool/dbconfig/20260302-131653-marostegui.json
* 13:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2181.codfw.wmnet with reason: Maintenance
* 13:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89502 and previous config saved to /var/cache/conftool/dbconfig/20260302-131630-marostegui.json
* 13:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1024.eqiad.wmnet
* 13:14 moritzm: installing libcap2 updates from Bookworm point release
* 13:12 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1156.eqiad.wmnet
* 13:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1155.eqiad.wmnet
* 13:08 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1024.eqiad.wmnet
* 13:07 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
* 13:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P89500 and previous config saved to /var/cache/conftool/dbconfig/20260302-130424-marostegui.json
* 13:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P89499 and previous config saved to /var/cache/conftool/dbconfig/20260302-130122-marostegui.json
* 13:00 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 12:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2356.codfw.wmnet
* 12:58 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2356.codfw.wmnet
* 12:58 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1155.eqiad.wmnet
* 12:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1154.eqiad.wmnet
* 12:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89498 and previous config saved to /var/cache/conftool/dbconfig/20260302-124917-marostegui.json
* 12:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1154.eqiad.wmnet
* 12:46 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1153.eqiad.wmnet
* 12:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P89497 and previous config saved to /var/cache/conftool/dbconfig/20260302-124615-marostegui.json
* 12:35 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1153.eqiad.wmnet
* 12:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1152.eqiad.wmnet
* 12:32 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89494 and previous config saved to /var/cache/conftool/dbconfig/20260302-123253-marostegui.json
* 12:32 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1203.eqiad.wmnet with reason: Maintenance
* 12:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89493 and previous config saved to /var/cache/conftool/dbconfig/20260302-123229-marostegui.json
* 12:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89492 and previous config saved to /var/cache/conftool/dbconfig/20260302-123108-marostegui.json
* 12:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 12:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 12:24 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1152.eqiad.wmnet
* 12:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1151.eqiad.wmnet
* 12:23 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 12:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 12:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 12:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P89491 and previous config saved to /var/cache/conftool/dbconfig/20260302-121722-marostegui.json
* 12:15 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89490 and previous config saved to /var/cache/conftool/dbconfig/20260302-121525-marostegui.json
* 12:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2167.codfw.wmnet with reason: Maintenance
* 12:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89489 and previous config saved to /var/cache/conftool/dbconfig/20260302-121501-marostegui.json
* 12:12 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1151.eqiad.wmnet
* 12:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1150.eqiad.wmnet
* 12:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P89488 and previous config saved to /var/cache/conftool/dbconfig/20260302-120214-marostegui.json
* 12:00 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1150.eqiad.wmnet
* 11:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P89487 and previous config saved to /var/cache/conftool/dbconfig/20260302-115953-marostegui.json
* 11:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89486 and previous config saved to /var/cache/conftool/dbconfig/20260302-114706-marostegui.json
* 11:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P89485 and previous config saved to /var/cache/conftool/dbconfig/20260302-114446-marostegui.json
* 11:30 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1193 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89484 and previous config saved to /var/cache/conftool/dbconfig/20260302-113034-marostegui.json
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo1001.eqiad.wmnet
* 11:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1193.eqiad.wmnet with reason: Maintenance
* 11:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89483 and previous config saved to /var/cache/conftool/dbconfig/20260302-113010-marostegui.json
* 11:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89482 and previous config saved to /var/cache/conftool/dbconfig/20260302-112937-marostegui.json
* 11:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo1001.eqiad.wmnet
* 11:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P89481 and previous config saved to /var/cache/conftool/dbconfig/20260302-111502-marostegui.json
* 11:13 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2166 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89480 and previous config saved to /var/cache/conftool/dbconfig/20260302-111351-marostegui.json
* 11:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2166.codfw.wmnet with reason: Maintenance
* 11:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89479 and previous config saved to /var/cache/conftool/dbconfig/20260302-111327-marostegui.json
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet
* 10:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P89478 and previous config saved to /var/cache/conftool/dbconfig/20260302-105955-marostegui.json
* 10:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P89477 and previous config saved to /var/cache/conftool/dbconfig/20260302-105818-marostegui.json
* 10:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet
* 10:55 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:54 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:52 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 10:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 10:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 10:49 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 10:46 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru and A:cp - 3.0 upgrade ()
* 10:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89476 and previous config saved to /var/cache/conftool/dbconfig/20260302-104446-marostegui.json
* 10:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P89475 and previous config saved to /var/cache/conftool/dbconfig/20260302-104310-marostegui.json
* 10:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89474 and previous config saved to /var/cache/conftool/dbconfig/20260302-102825-marostegui.json
* 10:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1192.eqiad.wmnet with reason: Maintenance
* 10:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89473 and previous config saved to /var/cache/conftool/dbconfig/20260302-102800-marostegui.json
* 10:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P89472 and previous config saved to /var/cache/conftool/dbconfig/20260302-101252-marostegui.json
* 10:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2164 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89471 and previous config saved to /var/cache/conftool/dbconfig/20260302-101200-marostegui.json
* 10:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2164.codfw.wmnet with reason: Maintenance
* 10:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89470 and previous config saved to /var/cache/conftool/dbconfig/20260302-101135-marostegui.json
* 10:08 moritzm: installing intel-microcode bugfix updates on Bookworm hosts
* 09:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P89469 and previous config saved to /var/cache/conftool/dbconfig/20260302-095744-marostegui.json
* 09:57 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru and A:cp - 3.0 upgrade ()
* 09:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P89468 and previous config saved to /var/cache/conftool/dbconfig/20260302-095627-marostegui.json
* 09:55 fabfur: start upgrading haproxy to 3.0 on A:cp-text_magru ([[phab:T417253|T417253]])
* 09:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89467 and previous config saved to /var/cache/conftool/dbconfig/20260302-094236-marostegui.json
* 09:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P89466 and previous config saved to /var/cache/conftool/dbconfig/20260302-094118-marostegui.json
* 09:35 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:35 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:34 moritzm: installing GnuTLS security updates
* 09:34 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:33 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89465 and previous config saved to /var/cache/conftool/dbconfig/20260302-092610-marostegui.json
* 09:26 mlitn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1245265{{!}}Limit additional whitespace to sticky header version only (T416598)]] (duration: 11m 02s)
* 09:26 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89464 and previous config saved to /var/cache/conftool/dbconfig/20260302-092600-marostegui.json
* 09:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1178.eqiad.wmnet with reason: Maintenance
* 09:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89463 and previous config saved to /var/cache/conftool/dbconfig/20260302-092535-marostegui.json
* 09:21 mlitn@deploy2002: mlitn: Continuing with sync
* 09:16 mlitn@deploy2002: mlitn: Backport for [[gerrit:1245265{{!}}Limit additional whitespace to sticky header version only (T416598)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:15 mlitn@deploy2002: Started scap sync-world: Backport for [[gerrit:1245265{{!}}Limit additional whitespace to sticky header version only (T416598)]]
* 09:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P89462 and previous config saved to /var/cache/conftool/dbconfig/20260302-091027-marostegui.json
* 09:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2163 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89461 and previous config saved to /var/cache/conftool/dbconfig/20260302-091003-marostegui.json
* 09:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2163.codfw.wmnet with reason: Maintenance
* 09:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89460 and previous config saved to /var/cache/conftool/dbconfig/20260302-090938-marostegui.json
* 09:08 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1246904{{!}}HCaptchaEnterpriseHealthChecker: Add configurable retry count and delay (T418477)]] (duration: 16m 09s)
* 09:02 kharlan@deploy2002: kharlan: Continuing with sync
* 08:57 kharlan@deploy2002: kharlan: Backport for [[gerrit:1246904{{!}}HCaptchaEnterpriseHealthChecker: Add configurable retry count and delay (T418477)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P89459 and previous config saved to /var/cache/conftool/dbconfig/20260302-085519-marostegui.json
* 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P89458 and previous config saved to /var/cache/conftool/dbconfig/20260302-085430-marostegui.json
* 08:51 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1246904{{!}}HCaptchaEnterpriseHealthChecker: Add configurable retry count and delay (T418477)]]
* 08:48 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru and A:cp - 3.0 upgrade ()
* 08:47 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbproxy1029.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 08:45 moritzm: installing libxml2 security updates
* 08:44 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1240672{{!}}Enable revert risk filters for first batch of wikis: < 1000 monthly edits (T411485)]] (duration: 37m 12s)
* 08:42 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dbproxy1029.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 08:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbproxy1028.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 08:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89457 and previous config saved to /var/cache/conftool/dbconfig/20260302-084010-marostegui.json
* 08:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P89456 and previous config saved to /var/cache/conftool/dbconfig/20260302-083922-marostegui.json
* 08:31 kgraessle@deploy2002: kgraessle: Continuing with sync
* 08:30 kgraessle@deploy2002: kgraessle: Backport for [[gerrit:1240672{{!}}Enable revert risk filters for first batch of wikis: < 1000 monthly edits (T411485)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dbproxy1028.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 08:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89455 and previous config saved to /var/cache/conftool/dbconfig/20260302-082414-marostegui.json
* 08:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1177 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89454 and previous config saved to /var/cache/conftool/dbconfig/20260302-082333-marostegui.json
* 08:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1177.eqiad.wmnet with reason: Maintenance
* 08:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89453 and previous config saved to /var/cache/conftool/dbconfig/20260302-082309-marostegui.json
* 08:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbproxy1028.eqiad.wmnet with reason: Maintenance
* 08:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbproxy1029.eqiad.wmnet with reason: Maintenance
* 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89452 and previous config saved to /var/cache/conftool/dbconfig/20260302-080813-marostegui.json
* 08:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2161.codfw.wmnet with reason: Maintenance
* 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P89451 and previous config saved to /var/cache/conftool/dbconfig/20260302-080800-marostegui.json
* 08:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89450 and previous config saved to /var/cache/conftool/dbconfig/20260302-080748-marostegui.json
* 08:07 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1240672{{!}}Enable revert risk filters for first batch of wikis: < 1000 monthly edits (T411485)]]
* 08:05 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru and A:cp - 3.0 upgrade ()
* 08:05 fabfur: start upgrading haproxy to 3.0 on A:cp-upload_magru ([[phab:T417253|T417253]])
* 07:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P89449 and previous config saved to /var/cache/conftool/dbconfig/20260302-075252-marostegui.json
* 07:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P89448 and previous config saved to /var/cache/conftool/dbconfig/20260302-075241-marostegui.json
* 07:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89447 and previous config saved to /var/cache/conftool/dbconfig/20260302-073745-marostegui.json
* 07:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P89446 and previous config saved to /var/cache/conftool/dbconfig/20260302-073732-marostegui.json
* 07:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89445 and previous config saved to /var/cache/conftool/dbconfig/20260302-072224-marostegui.json
* 07:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89444 and previous config saved to /var/cache/conftool/dbconfig/20260302-072058-marostegui.json
* 07:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1172.eqiad.wmnet with reason: Maintenance
* 07:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 07:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89443 and previous config saved to /var/cache/conftool/dbconfig/20260302-070523-marostegui.json
* 07:05 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2154 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89442 and previous config saved to /var/cache/conftool/dbconfig/20260302-070512-marostegui.json
* 07:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2154.codfw.wmnet with reason: Maintenance
* 07:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89441 and previous config saved to /var/cache/conftool/dbconfig/20260302-070447-marostegui.json
* 07:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1244: After schema change
* 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P89439 and previous config saved to /var/cache/conftool/dbconfig/20260302-065014-marostegui.json
* 06:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P89438 and previous config saved to /var/cache/conftool/dbconfig/20260302-064938-marostegui.json
* 06:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P89436 and previous config saved to /var/cache/conftool/dbconfig/20260302-063506-marostegui.json
* 06:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P89435 and previous config saved to /var/cache/conftool/dbconfig/20260302-063430-marostegui.json
* 06:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89433 and previous config saved to /var/cache/conftool/dbconfig/20260302-061957-marostegui.json
* 06:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89432 and previous config saved to /var/cache/conftool/dbconfig/20260302-061922-marostegui.json
* 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2240.codfw.wmnet with reason: Maintenance
* 06:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1244: After schema change
* 06:15 marostegui@dns1004: END - running authdns-update
* 06:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2240 [[phab:T418080|T418080]]', diff saved to https://phabricator.wikimedia.org/P89430 and previous config saved to /var/cache/conftool/dbconfig/20260302-061428-marostegui.json
* 06:13 marostegui@dns1004: START - running authdns-update
* 06:13 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2179 to s4 primary and set section read-write [[phab:T418080|T418080]]', diff saved to https://phabricator.wikimedia.org/P89429 and previous config saved to /var/cache/conftool/dbconfig/20260302-061316-marostegui.json
* 06:12 marostegui@cumin1003: dbctl commit (dc=all): 'Set s4 codfw as read-only for maintenance - [[phab:T418080|T418080]]', diff saved to https://phabricator.wikimedia.org/P89428 and previous config saved to /var/cache/conftool/dbconfig/20260302-061252-marostegui.json
* 06:06 marostegui: Starting s4 codfw failover from db2240 to db2179 - [[phab:T418080|T418080]]
* 06:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 42 hosts with reason: Primary switchover s4 [[phab:T418080|T418080]]
* 06:03 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2179 with weight 0 [[phab:T418080|T418080]]', diff saved to https://phabricator.wikimedia.org/P89427 and previous config saved to /var/cache/conftool/dbconfig/20260302-060317-marostegui.json
* 06:03 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89426 and previous config saved to /var/cache/conftool/dbconfig/20260302-060317-marostegui.json
* 06:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 06:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 06:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2152 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89425 and previous config saved to /var/cache/conftool/dbconfig/20260302-060245-marostegui.json
* 06:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2152.codfw.wmnet with reason: Maintenance
* 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2220.codfw.wmnet with reason: Maintenance
* 02:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 13s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 00:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 00:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89424 and previous config saved to /var/cache/conftool/dbconfig/20260302-004950-marostegui.json
* 00:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P89423 and previous config saved to /var/cache/conftool/dbconfig/20260302-003441-marostegui.json
* 00:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P89422 and previous config saved to /var/cache/conftool/dbconfig/20260302-001933-marostegui.json
* 00:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89421 and previous config saved to /var/cache/conftool/dbconfig/20260302-000425-marostegui.json
* 00:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1253 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89420 and previous config saved to /var/cache/conftool/dbconfig/20260302-000208-marostegui.json
* 00:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1253.eqiad.wmnet with reason: Maintenance
* 00:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89419 and previous config saved to /var/cache/conftool/dbconfig/20260302-000143-marostegui.json
== 2026-03-01 ==
* 23:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P89418 and previous config saved to /var/cache/conftool/dbconfig/20260301-234635-marostegui.json
* 23:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89417 and previous config saved to /var/cache/conftool/dbconfig/20260301-233524-marostegui.json
* 23:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P89416 and previous config saved to /var/cache/conftool/dbconfig/20260301-233127-marostegui.json
* 23:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P89415 and previous config saved to /var/cache/conftool/dbconfig/20260301-232016-marostegui.json
* 23:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89414 and previous config saved to /var/cache/conftool/dbconfig/20260301-231619-marostegui.json
* 23:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89413 and previous config saved to /var/cache/conftool/dbconfig/20260301-231404-marostegui.json
* 23:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1236.eqiad.wmnet with reason: Maintenance
* 23:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89412 and previous config saved to /var/cache/conftool/dbconfig/20260301-231339-marostegui.json
* 23:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P89411 and previous config saved to /var/cache/conftool/dbconfig/20260301-230508-marostegui.json
* 22:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P89410 and previous config saved to /var/cache/conftool/dbconfig/20260301-225832-marostegui.json
* 22:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89409 and previous config saved to /var/cache/conftool/dbconfig/20260301-224959-marostegui.json
* 22:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2222 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89408 and previous config saved to /var/cache/conftool/dbconfig/20260301-224451-marostegui.json
* 22:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2222.codfw.wmnet with reason: Maintenance
* 22:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89407 and previous config saved to /var/cache/conftool/dbconfig/20260301-224426-marostegui.json
* 22:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P89406 and previous config saved to /var/cache/conftool/dbconfig/20260301-224324-marostegui.json
* 22:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P89405 and previous config saved to /var/cache/conftool/dbconfig/20260301-222919-marostegui.json
* 22:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89404 and previous config saved to /var/cache/conftool/dbconfig/20260301-222815-marostegui.json
* 22:26 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1231 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89403 and previous config saved to /var/cache/conftool/dbconfig/20260301-222600-marostegui.json
* 22:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1231.eqiad.wmnet with reason: Maintenance
* 22:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89402 and previous config saved to /var/cache/conftool/dbconfig/20260301-222536-marostegui.json
* 22:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P89401 and previous config saved to /var/cache/conftool/dbconfig/20260301-221410-marostegui.json
* 22:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P89400 and previous config saved to /var/cache/conftool/dbconfig/20260301-221027-marostegui.json
* 21:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89399 and previous config saved to /var/cache/conftool/dbconfig/20260301-215902-marostegui.json
* 21:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P89398 and previous config saved to /var/cache/conftool/dbconfig/20260301-215519-marostegui.json
* 21:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89397 and previous config saved to /var/cache/conftool/dbconfig/20260301-215404-marostegui.json
* 21:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2221.codfw.wmnet with reason: Maintenance
* 21:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89396 and previous config saved to /var/cache/conftool/dbconfig/20260301-215339-marostegui.json
* 21:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89395 and previous config saved to /var/cache/conftool/dbconfig/20260301-214011-marostegui.json
* 21:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P89394 and previous config saved to /var/cache/conftool/dbconfig/20260301-213831-marostegui.json
* 21:34 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89393 and previous config saved to /var/cache/conftool/dbconfig/20260301-213410-marostegui.json
* 21:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 21:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89392 and previous config saved to /var/cache/conftool/dbconfig/20260301-213346-marostegui.json
* 21:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P89391 and previous config saved to /var/cache/conftool/dbconfig/20260301-212323-marostegui.json
* 21:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P89390 and previous config saved to /var/cache/conftool/dbconfig/20260301-211837-marostegui.json
* 21:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89389 and previous config saved to /var/cache/conftool/dbconfig/20260301-210815-marostegui.json
* 21:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P89388 and previous config saved to /var/cache/conftool/dbconfig/20260301-210329-marostegui.json
* 21:03 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89387 and previous config saved to /var/cache/conftool/dbconfig/20260301-210309-marostegui.json
* 21:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2218.codfw.wmnet with reason: Maintenance
* 21:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89386 and previous config saved to /var/cache/conftool/dbconfig/20260301-210244-marostegui.json
* 20:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89385 and previous config saved to /var/cache/conftool/dbconfig/20260301-204820-marostegui.json
* 20:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P89384 and previous config saved to /var/cache/conftool/dbconfig/20260301-204736-marostegui.json
* 20:46 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1202 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89383 and previous config saved to /var/cache/conftool/dbconfig/20260301-204606-marostegui.json
* 20:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1202.eqiad.wmnet with reason: Maintenance
* 20:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89382 and previous config saved to /var/cache/conftool/dbconfig/20260301-204541-marostegui.json
* 20:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P89381 and previous config saved to /var/cache/conftool/dbconfig/20260301-203227-marostegui.json
* 20:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P89380 and previous config saved to /var/cache/conftool/dbconfig/20260301-203033-marostegui.json
* 20:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89379 and previous config saved to /var/cache/conftool/dbconfig/20260301-201720-marostegui.json
* 20:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P89378 and previous config saved to /var/cache/conftool/dbconfig/20260301-201525-marostegui.json
* 20:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2208 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89377 and previous config saved to /var/cache/conftool/dbconfig/20260301-201212-marostegui.json
* 20:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2208.codfw.wmnet with reason: Maintenance
* 20:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2200.codfw.wmnet with reason: Maintenance
* 20:04 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2198.codfw.wmnet with reason: Maintenance
* 20:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89376 and previous config saved to /var/cache/conftool/dbconfig/20260301-200422-marostegui.json
* 20:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89375 and previous config saved to /var/cache/conftool/dbconfig/20260301-200016-marostegui.json
* 19:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1194 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89374 and previous config saved to /var/cache/conftool/dbconfig/20260301-195803-marostegui.json
* 19:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1194.eqiad.wmnet with reason: Maintenance
* 19:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89373 and previous config saved to /var/cache/conftool/dbconfig/20260301-195738-marostegui.json
* 19:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P89372 and previous config saved to /var/cache/conftool/dbconfig/20260301-194914-marostegui.json
* 19:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P89371 and previous config saved to /var/cache/conftool/dbconfig/20260301-194230-marostegui.json
* 19:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P89370 and previous config saved to /var/cache/conftool/dbconfig/20260301-193406-marostegui.json
* 19:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P89369 and previous config saved to /var/cache/conftool/dbconfig/20260301-192721-marostegui.json
* 19:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89368 and previous config saved to /var/cache/conftool/dbconfig/20260301-191858-marostegui.json
* 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2182 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89367 and previous config saved to /var/cache/conftool/dbconfig/20260301-191340-marostegui.json
* 19:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89366 and previous config saved to /var/cache/conftool/dbconfig/20260301-191315-marostegui.json
* 19:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89365 and previous config saved to /var/cache/conftool/dbconfig/20260301-191213-marostegui.json
* 19:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1191 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89364 and previous config saved to /var/cache/conftool/dbconfig/20260301-190958-marostegui.json
* 19:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1191.eqiad.wmnet with reason: Maintenance
* 19:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89363 and previous config saved to /var/cache/conftool/dbconfig/20260301-190934-marostegui.json
* 18:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P89362 and previous config saved to /var/cache/conftool/dbconfig/20260301-185807-marostegui.json
* 18:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P89361 and previous config saved to /var/cache/conftool/dbconfig/20260301-185425-marostegui.json
* 18:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P89360 and previous config saved to /var/cache/conftool/dbconfig/20260301-184259-marostegui.json
* 18:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P89359 and previous config saved to /var/cache/conftool/dbconfig/20260301-183917-marostegui.json
* 18:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89358 and previous config saved to /var/cache/conftool/dbconfig/20260301-182750-marostegui.json
* 18:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89357 and previous config saved to /var/cache/conftool/dbconfig/20260301-182409-marostegui.json
* 18:22 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2168 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89356 and previous config saved to /var/cache/conftool/dbconfig/20260301-182238-marostegui.json
* 18:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 18:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89355 and previous config saved to /var/cache/conftool/dbconfig/20260301-182213-marostegui.json
* 18:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89354 and previous config saved to /var/cache/conftool/dbconfig/20260301-182153-marostegui.json
* 18:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 18:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 18:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89353 and previous config saved to /var/cache/conftool/dbconfig/20260301-181818-marostegui.json
* 18:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P89352 and previous config saved to /var/cache/conftool/dbconfig/20260301-180705-marostegui.json
* 18:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P89351 and previous config saved to /var/cache/conftool/dbconfig/20260301-180310-marostegui.json
* 17:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P89350 and previous config saved to /var/cache/conftool/dbconfig/20260301-175157-marostegui.json
* 17:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P89349 and previous config saved to /var/cache/conftool/dbconfig/20260301-174802-marostegui.json
* 17:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89348 and previous config saved to /var/cache/conftool/dbconfig/20260301-173649-marostegui.json
* 17:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89347 and previous config saved to /var/cache/conftool/dbconfig/20260301-173253-marostegui.json
* 17:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89346 and previous config saved to /var/cache/conftool/dbconfig/20260301-173134-marostegui.json
* 17:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 17:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89345 and previous config saved to /var/cache/conftool/dbconfig/20260301-173110-marostegui.json
* 17:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89344 and previous config saved to /var/cache/conftool/dbconfig/20260301-172742-marostegui.json
* 17:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 17:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89343 and previous config saved to /var/cache/conftool/dbconfig/20260301-172717-marostegui.json
* 17:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P89342 and previous config saved to /var/cache/conftool/dbconfig/20260301-171602-marostegui.json
* 17:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P89341 and previous config saved to /var/cache/conftool/dbconfig/20260301-171210-marostegui.json
* 17:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P89340 and previous config saved to /var/cache/conftool/dbconfig/20260301-170053-marostegui.json
* 16:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P89339 and previous config saved to /var/cache/conftool/dbconfig/20260301-165701-marostegui.json
* 16:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89338 and previous config saved to /var/cache/conftool/dbconfig/20260301-164545-marostegui.json
* 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89337 and previous config saved to /var/cache/conftool/dbconfig/20260301-164153-marostegui.json
* 16:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2150 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89336 and previous config saved to /var/cache/conftool/dbconfig/20260301-164022-marostegui.json
* 16:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2150.codfw.wmnet with reason: Maintenance
* 16:39 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89335 and previous config saved to /var/cache/conftool/dbconfig/20260301-163938-marostegui.json
* 16:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 16:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 16:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1230.eqiad.wmnet with reason: Maintenance
* 16:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2213.codfw.wmnet with reason: Maintenance
* 12:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89334 and previous config saved to /var/cache/conftool/dbconfig/20260301-122201-marostegui.json
* 12:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P89333 and previous config saved to /var/cache/conftool/dbconfig/20260301-120652-marostegui.json
* 11:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P89332 and previous config saved to /var/cache/conftool/dbconfig/20260301-115144-marostegui.json
* 11:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89331 and previous config saved to /var/cache/conftool/dbconfig/20260301-113636-marostegui.json
* 11:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2228 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89330 and previous config saved to /var/cache/conftool/dbconfig/20260301-113156-marostegui.json
* 11:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2228.codfw.wmnet with reason: Maintenance
* 11:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89329 and previous config saved to /var/cache/conftool/dbconfig/20260301-113131-marostegui.json
* 11:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 11:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 11:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1216.eqiad.wmnet with reason: Maintenance
* 11:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89328 and previous config saved to /var/cache/conftool/dbconfig/20260301-111658-marostegui.json
* 11:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P89327 and previous config saved to /var/cache/conftool/dbconfig/20260301-111622-marostegui.json
* 11:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P89326 and previous config saved to /var/cache/conftool/dbconfig/20260301-110151-marostegui.json
* 11:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P89325 and previous config saved to /var/cache/conftool/dbconfig/20260301-110114-marostegui.json
* 10:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P89324 and previous config saved to /var/cache/conftool/dbconfig/20260301-104642-marostegui.json
* 10:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89323 and previous config saved to /var/cache/conftool/dbconfig/20260301-104606-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2223 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89322 and previous config saved to /var/cache/conftool/dbconfig/20260301-104024-marostegui.json
* 10:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2223.codfw.wmnet with reason: Maintenance
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89321 and previous config saved to /var/cache/conftool/dbconfig/20260301-103958-marostegui.json
* 10:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89320 and previous config saved to /var/cache/conftool/dbconfig/20260301-103134-marostegui.json
* 10:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89319 and previous config saved to /var/cache/conftool/dbconfig/20260301-102727-marostegui.json
* 10:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1210.eqiad.wmnet with reason: Maintenance
* 10:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89318 and previous config saved to /var/cache/conftool/dbconfig/20260301-102702-marostegui.json
* 10:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P89317 and previous config saved to /var/cache/conftool/dbconfig/20260301-102450-marostegui.json
* 10:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P89316 and previous config saved to /var/cache/conftool/dbconfig/20260301-101154-marostegui.json
* 10:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P89315 and previous config saved to /var/cache/conftool/dbconfig/20260301-100942-marostegui.json
* 09:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P89314 and previous config saved to /var/cache/conftool/dbconfig/20260301-095645-marostegui.json
* 09:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89313 and previous config saved to /var/cache/conftool/dbconfig/20260301-095434-marostegui.json
* 09:48 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2211 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89312 and previous config saved to /var/cache/conftool/dbconfig/20260301-094847-marostegui.json
* 09:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2211.codfw.wmnet with reason: Maintenance
* 09:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2201.codfw.wmnet with reason: Maintenance
* 09:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89311 and previous config saved to /var/cache/conftool/dbconfig/20260301-094432-marostegui.json
* 09:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89310 and previous config saved to /var/cache/conftool/dbconfig/20260301-094137-marostegui.json
* 09:38 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1207 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89309 and previous config saved to /var/cache/conftool/dbconfig/20260301-093835-marostegui.json
* 09:38 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1207.eqiad.wmnet with reason: Maintenance
* 09:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89308 and previous config saved to /var/cache/conftool/dbconfig/20260301-093810-marostegui.json
* 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P89307 and previous config saved to /var/cache/conftool/dbconfig/20260301-092923-marostegui.json
* 09:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P89306 and previous config saved to /var/cache/conftool/dbconfig/20260301-092302-marostegui.json
* 09:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P89305 and previous config saved to /var/cache/conftool/dbconfig/20260301-091415-marostegui.json
* 09:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P89304 and previous config saved to /var/cache/conftool/dbconfig/20260301-090754-marostegui.json
* 08:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89303 and previous config saved to /var/cache/conftool/dbconfig/20260301-085907-marostegui.json
* 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89302 and previous config saved to /var/cache/conftool/dbconfig/20260301-085427-marostegui.json
* 08:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2192.codfw.wmnet with reason: Maintenance
* 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89301 and previous config saved to /var/cache/conftool/dbconfig/20260301-085403-marostegui.json
* 08:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89300 and previous config saved to /var/cache/conftool/dbconfig/20260301-085246-marostegui.json
* 08:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1200 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89299 and previous config saved to /var/cache/conftool/dbconfig/20260301-084952-marostegui.json
* 08:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1200.eqiad.wmnet with reason: Maintenance
* 08:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89298 and previous config saved to /var/cache/conftool/dbconfig/20260301-084928-marostegui.json
* 08:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P89297 and previous config saved to /var/cache/conftool/dbconfig/20260301-083855-marostegui.json
* 08:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P89296 and previous config saved to /var/cache/conftool/dbconfig/20260301-083420-marostegui.json
* 08:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P89295 and previous config saved to /var/cache/conftool/dbconfig/20260301-082346-marostegui.json
* 08:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P89294 and previous config saved to /var/cache/conftool/dbconfig/20260301-081912-marostegui.json
* 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89293 and previous config saved to /var/cache/conftool/dbconfig/20260301-080838-marostegui.json
* 08:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89292 and previous config saved to /var/cache/conftool/dbconfig/20260301-080404-marostegui.json
* 08:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2178.codfw.wmnet with reason: Maintenance
* 08:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89291 and previous config saved to /var/cache/conftool/dbconfig/20260301-080341-marostegui.json
* 08:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1185 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89290 and previous config saved to /var/cache/conftool/dbconfig/20260301-080110-marostegui.json
* 08:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1185.eqiad.wmnet with reason: Maintenance
* 08:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89289 and previous config saved to /var/cache/conftool/dbconfig/20260301-080044-marostegui.json
* 07:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P89288 and previous config saved to /var/cache/conftool/dbconfig/20260301-074833-marostegui.json
* 07:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P89287 and previous config saved to /var/cache/conftool/dbconfig/20260301-074536-marostegui.json
* 07:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P89286 and previous config saved to /var/cache/conftool/dbconfig/20260301-073324-marostegui.json
* 07:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P89285 and previous config saved to /var/cache/conftool/dbconfig/20260301-073028-marostegui.json
* 07:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89284 and previous config saved to /var/cache/conftool/dbconfig/20260301-071816-marostegui.json
* 07:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89283 and previous config saved to /var/cache/conftool/dbconfig/20260301-071521-marostegui.json
* 07:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2171 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89282 and previous config saved to /var/cache/conftool/dbconfig/20260301-071226-marostegui.json
* 07:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 07:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89281 and previous config saved to /var/cache/conftool/dbconfig/20260301-071201-marostegui.json
* 07:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89280 and previous config saved to /var/cache/conftool/dbconfig/20260301-071113-marostegui.json
* 07:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 07:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 07:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89279 and previous config saved to /var/cache/conftool/dbconfig/20260301-071040-marostegui.json
* 06:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P89278 and previous config saved to /var/cache/conftool/dbconfig/20260301-065653-marostegui.json
* 06:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P89277 and previous config saved to /var/cache/conftool/dbconfig/20260301-065531-marostegui.json
* 06:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P89276 and previous config saved to /var/cache/conftool/dbconfig/20260301-064145-marostegui.json
* 06:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P89275 and previous config saved to /var/cache/conftool/dbconfig/20260301-064023-marostegui.json
* 06:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89274 and previous config saved to /var/cache/conftool/dbconfig/20260301-062636-marostegui.json
* 06:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89273 and previous config saved to /var/cache/conftool/dbconfig/20260301-062515-marostegui.json
* 06:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89272 and previous config saved to /var/cache/conftool/dbconfig/20260301-062108-marostegui.json
* 06:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1159.eqiad.wmnet with reason: Maintenance
* 06:20 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2157 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89271 and previous config saved to /var/cache/conftool/dbconfig/20260301-062047-marostegui.json
* 06:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2157.codfw.wmnet with reason: Maintenance
* 06:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2207.codfw.wmnet with reason: Maintenance
* 06:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1222.eqiad.wmnet with reason: Maintenance
* 02:13 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 00s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2396608
2396605
2026-03-29T02:00:48Z
Stashbot
7414
wikitext
text/x-wiki
== 2026-03-29 ==
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-03-28 ==
* 14:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 14:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 14:16 mutante: releases1003 - re-enabled puppet which was disabled due to [[phab:T418109|T418109]] but should not have been disabled during switch of the deployment server; leading to [[phab:T421532|T421532]]
== 2026-03-27 ==
* 18:11 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 18:00 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:50 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:40 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:39 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 17:39 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:39 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 17:38 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 17:37 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 17:37 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 17:35 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 17:34 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 17:34 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:30 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:30 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:24 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:19 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:15 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:04 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:55 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:50 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:47 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:42 dancy@deploy1003: Finished deploy [releng/jenkins-deploy@31ace7e] (releasing): (no justification provided) (duration: 01m 18s)
* 16:41 dancy@deploy1003: Started deploy [releng/jenkins-deploy@31ace7e] (releasing): (no justification provided)
* 16:37 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:36 jhathaway@cumin2002: END (ERROR) - Cookbook sre.hosts.provision (exit_code=97) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:27 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:22 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:13 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 16:12 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 16:12 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:11 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:10 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 15:07 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:00 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:09 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:09 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change ips for frack servers - cmooney@cumin1003"
* 14:08 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change ips for frack servers - cmooney@cumin1003"
* 14:02 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 13:52 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:51 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:49 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:49 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:48 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:47 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:47 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:45 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:41 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:40 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:14 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:11 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:10 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:08 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:06 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 11:53 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 11:51 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 11:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 11:48 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 11:30 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 11:27 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 11:15 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-test1006.eqiad.wmnet with OS trixie
* 11:15 taavi@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database abstractwiki ([[phab:T420637|T420637]])
* 11:02 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2002.codfw.wmnet
* 10:54 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2006.codfw.wmnet
* 10:51 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2002.codfw.wmnet
* 10:50 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2006.codfw.wmnet
* 10:46 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2005.codfw.wmnet
* 10:43 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2005.codfw.wmnet
* 10:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1006.eqiad.wmnet with reason: host reimage
* 10:27 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1006.eqiad.wmnet with reason: host reimage
* 10:18 taavi@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database abstractwiki ([[phab:T420637|T420637]])
* 10:12 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1006.eqiad.wmnet with OS trixie
* 10:04 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 10:03 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:01 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:00 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:58 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:57 elukey@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
* 09:37 elukey@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
* 09:06 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 09:05 elukey@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 09:04 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 09:03 elukey@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 08:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 08:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:05 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 08:04 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 08:02 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 07:46 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 03:06 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 02:32 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 02:12 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 07s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:30 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421330|T421330]]
* 01:30 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on zuul1001.eqiad.wmnet with reason: [[phab:T421330|T421330]]
* 01:30 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on zuul2001.codfw.wmnet with reason: [[phab:T421330|T421330]]
* 01:29 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421330|T421330]]
* 01:12 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
== 2026-03-26 ==
* 21:35 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1260834{{!}}Add Logstash logging for successful passwordless logins]], [[gerrit:1261511{{!}}InitialiseSettings: Remove apiportalwiki from $wmgCentralAuthAutoLoginWikis (T421413)]] (duration: 06m 58s)
* 21:31 reedy@deploy1003: catrope, reedy: Continuing with sync
* 21:30 reedy@deploy1003: catrope, reedy: Backport for [[gerrit:1260834{{!}}Add Logstash logging for successful passwordless logins]], [[gerrit:1261511{{!}}InitialiseSettings: Remove apiportalwiki from $wmgCentralAuthAutoLoginWikis (T421413)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:28 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1260834{{!}}Add Logstash logging for successful passwordless logins]], [[gerrit:1261511{{!}}InitialiseSettings: Remove apiportalwiki from $wmgCentralAuthAutoLoginWikis (T421413)]]
* 21:00 suecarmol@deploy1003: Finished scap sync-world: Backport for [[gerrit:1256498{{!}}PersonalDashboard: Add config for Active Discussions (T420785)]] (duration: 13m 53s)
* 20:54 suecarmol@deploy1003: suecarmol: Continuing with sync
* 20:51 suecarmol@deploy1003: suecarmol: Backport for [[gerrit:1256498{{!}}PersonalDashboard: Add config for Active Discussions (T420785)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:46 suecarmol@deploy1003: Started scap sync-world: Backport for [[gerrit:1256498{{!}}PersonalDashboard: Add config for Active Discussions (T420785)]]
* 20:44 kamila@deploy1003: Finished scap sync-world: Backport for [[gerrit:1261545{{!}}Wrap 'centralauthtoken' in a JWT (T420280)]], [[gerrit:1261470{{!}}Enable $wgTempCategoryCollations for testwiki. (T419274 T419049)]] (duration: 37m 32s)
* 20:30 kamila@deploy1003: matmarex, kamila: Continuing with sync
* 20:25 kamila@deploy1003: matmarex, kamila: Backport for [[gerrit:1261545{{!}}Wrap 'centralauthtoken' in a JWT (T420280)]], [[gerrit:1261470{{!}}Enable $wgTempCategoryCollations for testwiki. (T419274 T419049)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host restbase2039.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:09 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host restbase2039.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:08 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>aqs1015.eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 20:08 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host restbase2039
* 20:07 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host restbase2039
* 20:06 kamila@deploy1003: Started scap sync-world: Backport for [[gerrit:1261545{{!}}Wrap 'centralauthtoken' in a JWT (T420280)]], [[gerrit:1261470{{!}}Enable $wgTempCategoryCollations for testwiki. (T419274 T419049)]]
* 20:05 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:05 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding restbase2039 to codfw - jhancock@cumin2002"
* 20:05 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding restbase2039 to codfw - jhancock@cumin2002"
* 20:02 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 20:00 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>aqs1015.eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 19:47 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 19:44 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 18:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 18:48 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 18:42 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/zotero: apply
* 18:42 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/zotero: apply
* 18:42 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
* 18:41 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
* 18:41 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
* 18:40 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/toolhub: apply
* 18:40 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/termbox: apply
* 18:40 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 18:39 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/termbox: apply
* 18:39 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 18:37 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 18:37 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 18:37 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 18:36 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 18:36 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 18:36 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 18:36 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
* 18:35 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
* 18:35 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/recommendation-api: apply
* 18:35 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/recommendation-api: apply
* 18:34 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rdf-streaming-updater: apply
* 18:34 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply
* 18:33 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/push-notifications: apply
* 18:33 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/push-notifications: apply
* 18:32 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 18:31 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 18:31 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
* 18:30 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
* 18:30 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 18:30 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
* 18:30 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 18:28 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 18:28 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:28 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:27 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
* 18:25 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>sessionstore1006.eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 18:21 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
* 18:21 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 18:20 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 18:20 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
* 18:19 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/ipoid: apply
* 18:19 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/image-suggestion: apply
* 18:19 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/image-suggestion: apply
* 18:18 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>sessionstore1006.eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 18:18 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
* 18:17 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
* 18:17 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
* 18:16 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
* 18:15 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 18:14 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 18:13 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
* 18:13 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
* 18:13 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
* 18:12 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
* 18:12 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:12 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 18:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 18:11 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 18:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
* 18:11 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
* 18:10 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
* 18:09 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
* 18:09 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/echostore: apply
* 18:08 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/echostore: apply
* 18:08 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
* 18:08 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
* 18:08 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 18:07 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 18:07 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
* 18:07 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
* 18:07 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 18:06 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 18:06 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/commons-impact-analytics: apply
* 18:06 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/commons-impact-analytics: apply
* 18:04 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 18:04 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 18:03 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 18:03 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 18:03 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/chart-renderer: apply
* 18:02 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/chart-renderer: apply
* 17:59 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/apertium: apply
* 17:58 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/apertium: apply
* 17:55 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to enable envoy drain on remaining services - [[phab:T364245|T364245]] (duration: 05m 31s)
* 17:52 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to enable envoy drain on remaining services - [[phab:T364245|T364245]]
* 17:35 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:39 rzl@deploy1003: Finished scap sync-world: https://gerrit.wikimedia.org/r/1256396 [[phab:T420666|T420666]] (duration: 11m 21s)
* 16:35 rzl@deploy1003: rzl: Continuing with sync
* 16:34 rzl@deploy1003: rzl: https://gerrit.wikimedia.org/r/1256396 [[phab:T420666|T420666]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:31 rzl@deploy1003: Started scap sync-world: https://gerrit.wikimedia.org/r/1256396 [[phab:T420666|T420666]]
* 16:27 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 16:17 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 16:17 blake@deploy1003: Finished scap sync-world: Test deployment to validate deployment server switchover - [[phab:T413974|T413974]] (duration: 31m 09s)
* 16:16 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 16:05 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 15:47 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1202.eqiad.wmnet onto db1253.eqiad.wmnet
* 15:47 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1253: Pool db1253.eqiad.wmnet in after cloning
* 15:46 blake@deploy1003: Started scap sync-world: Test deployment to validate deployment server switchover - [[phab:T413974|T413974]]
* 15:44 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 15:43 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421330|T421330]]
* 15:43 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421330|T421330]]
* 15:33 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 15:30 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync
* 15:30 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: sync
* 15:23 blake@dns1004: END - running authdns-update
* 15:22 bjensen: updating dns for the deployment host switchover
* 15:21 blake@dns1004: START - running authdns-update
* 15:19 blake@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet,releases1003.eqiad.wmnet with reason: Deployment server switchover
* 15:03 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:01 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1253: Pool db1253.eqiad.wmnet in after cloning
* 14:39 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:28 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:23 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 14:22 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 14:21 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 14:21 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 14:21 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:20 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 14:20 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 14:20 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 14:19 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 14:18 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 14:18 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 14:17 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 14:17 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 14:17 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 14:17 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 14:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1202: Pool db1202.eqiad.wmnet in after cloning
* 13:57 jynus: dropping ms-backup[12]00[12] grants from backup1-* dbs [[phab:T420464|T420464]]
* 13:56 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1070.eqiad.wmnet
* 13:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1070.eqiad.wmnet
* 13:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1097.eqiad.wmnet
* 13:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1097.eqiad.wmnet
* 13:54 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1055.eqiad.wmnet
* 13:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1055.eqiad.wmnet
* 13:46 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:45 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:40 sergi0: UTC afternoon backport window done
* 13:39 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259132{{!}}GrowthExperiments: scale edit and thanks query limit to more wikis (T341599)]] (duration: 09m 17s)
* 13:35 sgimeno@deploy2002: sgimeno: Continuing with sync
* 13:32 sgimeno@deploy2002: sgimeno: Backport for [[gerrit:1259132{{!}}GrowthExperiments: scale edit and thanks query limit to more wikis (T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1259132{{!}}GrowthExperiments: scale edit and thanks query limit to more wikis (T341599)]]
* 13:26 jforrester@deploy2002: Finished deploy [integration/docroot@f021d3f]: {{Gerrit|Ia936ecd68e675cff2925dba933e3b67b9bad4cd6}} (duration: 00m 11s)
* 13:26 jforrester@deploy2002: Started deploy [integration/docroot@f021d3f]: {{Gerrit|Ia936ecd68e675cff2925dba933e3b67b9bad4cd6}}
* 13:24 kamila@deploy2002: Finished scap sync-world: Backport for [[gerrit:1256384{{!}}Temporarily add shellbox-icu to $wgShellboxUrls (T419049 T419242 T419274)]] (duration: 07m 16s)
* 13:20 kamila@deploy2002: kamila: Continuing with sync
* 13:19 kamila@deploy2002: kamila: Backport for [[gerrit:1256384{{!}}Temporarily add shellbox-icu to $wgShellboxUrls (T419049 T419242 T419274)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:17 kamila@deploy2002: Started scap sync-world: Backport for [[gerrit:1256384{{!}}Temporarily add shellbox-icu to $wgShellboxUrls (T419049 T419242 T419274)]]
* 13:15 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1202: Pool db1202.eqiad.wmnet in after cloning
* 13:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 13:13 kamila@deploy2002: Finished scap sync-world: Backport for [[gerrit:1261420{{!}}cswiki: lift IP cap for editathon (T421305)]] (duration: 07m 22s)
* 13:12 btullis@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 13:09 kamila@deploy2002: kamila, anzx: Continuing with sync
* 13:08 jynus: deploying new grants for new ms-backup hosts and removing old ones [[phab:T420464|T420464]]
* 13:08 kamila@deploy2002: kamila, anzx: Backport for [[gerrit:1261420{{!}}cswiki: lift IP cap for editathon (T421305)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:06 kamila@deploy2002: Started scap sync-world: Backport for [[gerrit:1261420{{!}}cswiki: lift IP cap for editathon (T421305)]]
* 13:03 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:52 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:50 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:43 cdanis: puppet reenabled on drmrs, CIDERGRINDER deployed
* 12:39 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:39 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:23 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:12 cdanis: cdanis@cumin1003.eqiad.wmnet ~ sudo cumin 'A:cp-drmrs' 'disable-puppet "cdanis CIDER"'
* 12:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1004.eqiad.wmnet
* 12:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1006.eqiad.wmnet
* 12:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1003.eqiad.wmnet
* 12:02 elukey@cumin1003: END (PASS) - Cookbook sre.postgresql.postgres-init (exit_code=0)
* 12:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1002.eqiad.wmnet
* 12:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1005.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1006.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1005.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1004.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1003.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1002.eqiad.wmnet
* 11:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1001.eqiad.wmnet
* 11:44 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:43 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:43 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:42 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:42 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:42 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:41 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:41 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 11:41 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 11:41 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1001.eqiad.wmnet
* 11:38 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1002.eqiad.wmnet
* 11:37 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 11:33 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1002.eqiad.wmnet
* 11:32 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1202: Depool db1202.eqiad.wmnet to then clone it to db1253.eqiad.wmnet - fceratto@cumin1003
* 11:31 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
* 11:31 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1202: Depool db1202.eqiad.wmnet to then clone it to db1253.eqiad.wmnet - fceratto@cumin1003
* 11:31 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db1202.eqiad.wmnet onto db1253.eqiad.wmnet
* 11:31 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 11:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1001.eqiad.wmnet
* 11:22 elukey@cumin1003: START - Cookbook sre.postgresql.postgres-init
* 11:22 elukey@cumin1003: END (FAIL) - Cookbook sre.postgresql.postgres-init (exit_code=99)
* 11:22 elukey@cumin1003: START - Cookbook sre.postgresql.postgres-init
* 11:19 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1001.eqiad.wmnet
* 11:15 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync
* 11:14 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: sync
* 11:14 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 11:13 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 11:07 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 11:04 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260802{{!}}SI: Enable on bnwiki, itwiki, simplewiki, and plwiki (T415529)]] (duration: 09m 23s)
* 10:59 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 10:56 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1260802{{!}}SI: Enable on bnwiki, itwiki, simplewiki, and plwiki (T415529)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:54 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1260802{{!}}SI: Enable on bnwiki, itwiki, simplewiki, and plwiki (T415529)]]
* 10:33 oblivian@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: sync
* 10:32 oblivian@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifeeds: sync
* 10:32 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 10:32 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 10:31 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 10:31 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 10:31 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 10:31 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 10:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:23 tappof@cumin1003: END (PASS) - Cookbook sre.o11y.thanos-compact-restart (exit_code=0)
* 10:23 tappof@cumin1003: START - Cookbook sre.o11y.thanos-compact-restart
* 10:22 tappof@cumin1003: END (PASS) - Cookbook sre.o11y.thanos-compact-restart (exit_code=0)
* 10:22 tappof@cumin1003: START - Cookbook sre.o11y.thanos-compact-restart
* 10:12 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s1
* 10:11 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s1
* 10:05 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s4
* 10:05 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s4
* 09:58 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s8
* 09:58 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s8
* 09:53 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 09:53 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:52 hashar: Starting Gerrit on the replica / gerrit1003
* 09:51 hashar: Stopping Gerrit on the replica / gerrit1003 to clear web sessions
* 09:51 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s7
* 09:50 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s7
* 09:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 09:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 09:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 09:46 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 09:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 09:44 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s3
* 09:43 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s3
* 09:42 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 09:36 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s2
* 09:36 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s2
* 09:31 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 09:29 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s5
* 09:29 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:29 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s5
* 09:26 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 09:22 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s6
* 09:22 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 09:22 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s6
* 09:18 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:16 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section es6
* 09:15 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section es6
* 09:13 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 09:11 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 09:10 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 09:08 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section x3
* 09:07 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section x3
* 09:05 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 09:02 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section x1
* 09:01 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section x1
* 09:01 federico3: starting [[phab:T416708|T416708]] - disabling circular replication on core dbs
* 08:56 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 08:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 08:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 08:43 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:41 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:32 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 08:27 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 08:18 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 08:11 hashar@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.21 refs [[phab:T420479|T420479]]
* 05:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Downgrade clouddb1022 to 10.11.13
== 2026-03-25 ==
* 23:59 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on zuul2001.codfw.wmnet with reason: [[phab:T421330|T421330]]
* 23:58 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on zuul1001.eqiad.wmnet with reason: [[phab:T421330|T421330]]
* 23:29 mutante: zuul1001 - installed mariadb-client - connected once to zuul db on m1-master; mysql> truncate "alembic_version"; - systemctl restart zuul-web - This fixed the zuul-web service; systemctl status finally shows no errors. ([[phab:T405119|T405119]])
* 21:38 ryankemper: [opensearch-k8s] [[phab:T414484|T414484]] Depooled eqiad and verified the change: running `host k8s-ingress-dse-aa.discovery.wmnet` from `cumin1003` and then reverse-looking-up the resulting IP now returns a codfw address, so traffic is now routing to dse-k8s-codfw
* 21:35 ryankemper@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 21:30 Dreamy_Jazz: Created cusi_case, cusi_user, and cusi_signal on bnwiki, itwiki, simplewiki, plwiki for [[phab:T415529|T415529]]
* 21:27 ryankemper: [opensearch-k8s] [[phab:T414484|T414484]] Getting ready to depool `dnsdisc=k8s-ingress-dse-aa,name=eqiad`, leaving codfw pooled. This will get us ready for a full rolling-upgrade of the dse-k8s-eqiad cluster tomorrow.
* 21:23 Dreamy_Jazz: Evening UTC backport window done
* 21:08 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260797{{!}}SuggestedInvestigations: Import session into signal matching job (T421062)]] (duration: 10m 26s)
* 21:04 kharlan@deploy2002: kharlan: Continuing with sync
* 21:01 kharlan@deploy2002: kharlan: Backport for [[gerrit:1260797{{!}}SuggestedInvestigations: Import session into signal matching job (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:58 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1260797{{!}}SuggestedInvestigations: Import session into signal matching job (T421062)]]
* 20:51 eevans@cumin1003: END (ERROR) - Cookbook sre.cassandra.roll-reboot (exit_code=97) rolling reboot on P<nowiki>{</nowiki>sessionstore[1004-1006].eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 20:43 aaron@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259183{{!}}Add Analytics APIs to the RestSandbox (T419429)]] (duration: 08m 33s)
* 20:38 aaron@deploy2002: aaron: Continuing with sync
* 20:36 aaron@deploy2002: aaron: Backport for [[gerrit:1259183{{!}}Add Analytics APIs to the RestSandbox (T419429)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:34 aaron@deploy2002: Started scap sync-world: Backport for [[gerrit:1259183{{!}}Add Analytics APIs to the RestSandbox (T419429)]]
* 20:30 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247073{{!}}Deploy temporary accounts to ruwiki (T413771)]] (duration: 11m 04s)
* 20:25 jdlrobson@deploy2002: stran, jdlrobson: Continuing with sync
* 20:21 jdlrobson@deploy2002: stran, jdlrobson: Backport for [[gerrit:1247073{{!}}Deploy temporary accounts to ruwiki (T413771)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:19 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1247073{{!}}Deploy temporary accounts to ruwiki (T413771)]]
* 20:17 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255775{{!}}Close the legacy-vector dblist (T421289)]] (duration: 07m 42s)
* 20:14 aokoth@cumin1003: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:14 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:13 aokoth@cumin1003: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:13 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:12 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 20:12 aokoth@cumin1003: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:12 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1255775{{!}}Close the legacy-vector dblist (T421289)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1255775{{!}}Close the legacy-vector dblist (T421289)]]
* 20:05 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy (exit_code=0) rolling reboot on P<nowiki>{</nowiki>hcaptcha-proxy7002.wikimedia.org<nowiki>}</nowiki> and A:hcaptcha-proxy
* 20:01 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy rolling reboot on P<nowiki>{</nowiki>hcaptcha-proxy7002.wikimedia.org<nowiki>}</nowiki> and A:hcaptcha-proxy
* 20:00 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>sessionstore[1004-1006].eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 19:34 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pybal-test2003.codfw.wmnet
* 19:30 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host pybal-test2003.codfw.wmnet
* 19:26 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 19:24 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>sessionstore[2004-2006].codfw.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 19:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2011.codfw.wmnet
* 19:17 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2011.codfw.wmnet
* 19:17 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 19:14 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Planned reboot
* 19:11 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 19:11 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 19:07 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 19:00 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2012.codfw.wmnet
* 18:57 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2012.codfw.wmnet
* 18:53 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>sessionstore[2004-2006].codfw.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 18:51 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
* 18:51 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
* 18:50 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
* 18:50 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
* 18:49 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
* 18:49 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
* 18:49 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
* 18:48 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/toolhub: apply
* 18:48 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/termbox: apply
* 18:47 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/termbox: apply
* 18:47 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 18:47 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 18:47 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 18:46 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 18:46 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:46 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:46 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 18:46 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2012.codfw.wmnet with reason: Planned reboot
* 18:46 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 18:46 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 18:45 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 18:45 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 18:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 18:44 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
* 18:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/sessionstore: apply
* 18:44 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/recommendation-api: apply
* 18:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/recommendation-api: apply
* 18:43 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
* 18:43 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
* 18:43 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/push-notifications: apply
* 18:42 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/push-notifications: apply
* 18:42 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 18:41 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/proton: apply
* 18:41 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 18:40 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
* 18:40 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/page-analytics: apply
* 18:39 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 18:39 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
* 18:37 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 18:37 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 18:37 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 18:37 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 18:35 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 18:34 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:34 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:33 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
* 18:29 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 18:28 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
* 18:28 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* 18:26 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 18:26 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
* 18:26 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
* 18:25 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: debug java install
* 18:25 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 18:25 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases1003.eqiad.wmnet with reason: debug java install
* 18:25 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/image-suggestion: apply
* 18:24 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/image-suggestion: apply
* 18:23 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
* 18:23 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
* 18:23 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
* 18:22 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
* 18:22 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 18:22 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 18:21 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
* 18:21 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
* 18:20 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
* 18:20 mutante: releases1003 - apt-get upgrade - envoyproxy, python3-wmflib
* 18:20 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
* 18:20 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:19 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 18:19 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 18:18 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 18:18 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
* 18:18 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
* 18:17 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
* 18:17 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
* 18:16 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 18:16 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/echostore: apply
* 18:15 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
* 18:15 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
* 18:15 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 18:14 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 18:14 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/data-gateway: apply
* 18:14 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/data-gateway: apply
* 18:14 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 18:13 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 18:13 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/commons-impact-analytics: apply
* 18:13 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/commons-impact-analytics: apply
* 18:12 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 18:12 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
* 18:11 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 18:11 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
* 18:11 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/chart-renderer: apply
* 18:11 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/chart-renderer: apply
* 18:09 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/apertium: apply
* 18:09 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/apertium: apply
* 17:29 sfaci@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 17:29 sfaci@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 17:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 17:23 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases1003.eqiad.wmnet with reason: debug java install
* 17:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 16:44 ebysans@deploy2002: Finished deploy [analytics/refinery@80c527b] (thin): Regular analytics weekly train THIN [analytics/refinery@80c527b6] (duration: 01m 59s)
* 16:42 ebysans@deploy2002: Started deploy [analytics/refinery@80c527b] (thin): Regular analytics weekly train THIN [analytics/refinery@80c527b6]
* 16:42 SandraEbele_: Deploying Refinery as part of weekly deployment train
* 16:41 ebysans@deploy2002: Finished deploy [analytics/refinery@80c527b]: Regular analytics weekly train [analytics/refinery@80c527b6] (duration: 04m 32s)
* 16:37 ebysans@deploy2002: Started deploy [analytics/refinery@80c527b]: Regular analytics weekly train [analytics/refinery@80c527b6]
* 16:22 ebysans@deploy2002: Finished deploy [analytics/refinery@80c527b] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@80c527b6] (duration: 01m 58s)
* 16:22 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 16:21 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:21 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:20 ebysans@deploy2002: Started deploy [analytics/refinery@80c527b] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@80c527b6]
* 16:20 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 16:19 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 16:18 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:18 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:06 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 16:05 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 16:05 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 16:04 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 16:03 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:02 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 16:02 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 16:01 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:51 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:50 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:42 blake@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244628{{!}}debug: reorder debug backends for eqiad switchover (T413974)]] (duration: 07m 41s)
* 15:41 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:37 blake@deploy2002: blake: Continuing with sync
* 15:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:37 blake@deploy2002: blake: Backport for [[gerrit:1244628{{!}}debug: reorder debug backends for eqiad switchover (T413974)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:34 blake@deploy2002: Started scap sync-world: Backport for [[gerrit:1244628{{!}}debug: reorder debug backends for eqiad switchover (T413974)]]
* 15:34 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:32 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-unlock-scap (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:32 root@deploy2002: Unlocked for deployment [ALL REPOSITORIES]: Datacenter switchover from codfw to eqiad - (duration: 91m 45s)
* 15:32 root@deploy2002: Forcefully removing global lock: Datacenter switchover from codfw to eqiad -
* 15:32 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-unlock-scap for datacenter switchover from codfw to eqiad
* 15:31 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:26 blake@dns1004: END - running authdns-update
* 15:24 blake@dns1004: START - running authdns-update
* 15:24 elukey@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 15:23 elukey@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 15:18 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters for datacenter switchover from codfw to eqiad
* 15:18 blake@dns1004: END - running authdns-update
* 15:16 blake@dns1004: START - running authdns-update
* 15:14 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-restore-ttl (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:13 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-restore-ttl for datacenter switchover from codfw to eqiad
* 15:11 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:10 root@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 15:09 root@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 15:08 root@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 15:07 root@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 15:07 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.08-start-maintenance for datacenter switchover from codfw to eqiad
* 15:07 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:07 root@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: sync
* 15:07 root@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: sync
* 15:07 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner for datacenter switchover from codfw to eqiad
* 15:02 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:02 blake@cumin1003: MediaWiki read-only period ends at: 2026-03-25 15:02:52.921926
* 14:55 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0) for datacenter switchover from codfw to eqiad
* 14:53 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance for datacenter switchover from codfw to eqiad
* 14:52 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks (exit_code=0) for datacenter switchover from codfw to eqiad
* 14:52 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks for datacenter switchover from codfw to eqiad
* 14:51 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0) for datacenter switchover from codfw to eqiad
* 14:46 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl for datacenter switchover from codfw to eqiad
* 14:28 cdanis: 💙cdanis@apt1002.wikimedia.org ~ 🕥☕ sudo -i reprepro --component main --restrict cidergrinder update trixie-wikimedia
* 14:28 cdanis: 💙cdanis@apt1002.wikimedia.org ~ 🕥☕ sudo -i reprepro --component main --restrict cidergrinder update bullseye-wikimedia
* 14:24 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['phab2002']
* 14:24 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:17 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['phab2002']
* 14:14 gengh@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:13 gengh@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:13 gengh@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:12 gengh@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 gengh@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:11 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:11 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:11 gengh@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:10 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:10 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:08 gengh@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:07 gengh@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 gengh@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:07 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:06 gengh@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:06 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:06 gengh@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:05 gengh@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-lock-scap (exit_code=0) for datacenter switchover from codfw to eqiad
* 14:00 root@deploy2002: Locking from deployment [ALL REPOSITORIES]: Datacenter switchover from codfw to eqiad -
* 14:00 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-lock-scap for datacenter switchover from codfw to eqiad
* 13:49 otto@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260091{{!}}EventStreamConfig - Increase spark_job_ingestion_scale for larger event streams (T360794 T351225)]] (duration: 07m 48s)
* 13:45 otto@deploy2002: otto: Continuing with sync
* 13:45 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:44 otto@deploy2002: otto: Backport for [[gerrit:1260091{{!}}EventStreamConfig - Increase spark_job_ingestion_scale for larger event streams (T360794 T351225)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:42 otto@deploy2002: Started scap sync-world: Backport for [[gerrit:1260091{{!}}EventStreamConfig - Increase spark_job_ingestion_scale for larger event streams (T360794 T351225)]]
* 13:32 awight@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260614{{!}}[beta] Kill synthetic refs with feature flag (T421055)]], [[gerrit:1251193{{!}}idwiki: Remove unused user groups on Indonesian Wikipedia (T419105)]], [[gerrit:1251200{{!}}ptwiki: Enable block action for the abuse filter (T419312)]], [[gerrit:1256748{{!}}ptwiki: Add suppressredirect to autoreviewer and rollbacker user groups (T420704)]] (duration: 11m 33s)
* 13:29 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:29 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:27 awight@deploy2002: codenamenoreste, awight, gerrit-patch-uploader: Continuing with sync
* 13:23 awight@deploy2002: codenamenoreste, awight, gerrit-patch-uploader: Backport for [[gerrit:1260614{{!}}[beta] Kill synthetic refs with feature flag (T421055)]], [[gerrit:1251193{{!}}idwiki: Remove unused user groups on Indonesian Wikipedia (T419105)]], [[gerrit:1251200{{!}}ptwiki: Enable block action for the abuse filter (T419312)]], [[gerrit:1256748{{!}}ptwiki: Add suppressredirect to autoreviewer and rollbacker user groups (T420704)]]
* 13:20 awight@deploy2002: Started scap sync-world: Backport for [[gerrit:1260614{{!}}[beta] Kill synthetic refs with feature flag (T421055)]], [[gerrit:1251193{{!}}idwiki: Remove unused user groups on Indonesian Wikipedia (T419105)]], [[gerrit:1251200{{!}}ptwiki: Enable block action for the abuse filter (T419312)]], [[gerrit:1256748{{!}}ptwiki: Add suppressredirect to autoreviewer and rollbacker user groups (T420704)]]
* 13:17 dcausse@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260045{{!}}Revert^2 "search: use the discovery ns record for the semanticsearch cluster"]] (duration: 10m 20s)
* 13:12 dcausse@deploy2002: dcausse: Continuing with sync
* 13:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:09 dcausse@deploy2002: dcausse: Backport for [[gerrit:1260045{{!}}Revert^2 "search: use the discovery ns record for the semanticsearch cluster"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:07 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:06 dcausse@deploy2002: Started scap sync-world: Backport for [[gerrit:1260045{{!}}Revert^2 "search: use the discovery ns record for the semanticsearch cluster"]]
* 13:06 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:02 XioNoX: Inter.Link - DDoS - Activation of automatic reroute
* 12:56 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:51 marostegui@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 12:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Downgrade clouddb1022 to 10.11.15
* 12:41 marostegui@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 12:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-mariadb1002.eqiad.wmnet
* 12:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-test-coord1002.eqiad.wmnet
* 12:38 mszwarc@deploy2002: mwscript-k8s job started: foreachwikiindblist all demoteIneligibleUsers.php --relay-log checkuser=metawiki --relay-log suppress=metawiki # [[phab:T418580|T418580]]
* 12:34 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-test-coord1002.eqiad.wmnet
* 12:34 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-mariadb1002.eqiad.wmnet
* 12:33 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:32 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1028.eqiad.wmnet
* 12:25 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host wdqs1028.eqiad.wmnet
* 12:24 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:19 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260617{{!}}Allow for demoting 2FA-less members of further 6 groups (T418580)]] (duration: 10m 23s)
* 12:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2009.codfw.wmnet
* 12:12 mszwarc@deploy2002: mszwarc: Continuing with sync
* 12:11 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1260617{{!}}Allow for demoting 2FA-less members of further 6 groups (T418580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:09 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1260617{{!}}Allow for demoting 2FA-less members of further 6 groups (T418580)]]
* 12:07 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host wdqs2009.codfw.wmnet
* 12:07 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl2002.codfw.wmnet
* 12:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl2002.codfw.wmnet
* 11:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl2001.codfw.wmnet
* 11:53 marostegui: Restart clouddb1022:s3 to enable error_log [[phab:T420177|T420177]]
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl2001.codfw.wmnet
* 11:51 jayme: migrated wikikube apiservers (eqiad and codfw) to IPIP - [[phab:T420436|T420436]]
* 11:49 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: wikikube-master-codfw@codfw
* 11:49 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 11:48 jayme@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 11:46 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: wikikube-master-eqiad@eqiad
* 11:46 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 11:45 jayme@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 11:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1002.eqiad.wmnet
* 11:43 jayme@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: wikikube-master-codfw@codfw
* 11:41 jayme@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: wikikube-master-eqiad@eqiad
* 11:40 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1002.eqiad.wmnet
* 11:38 mvernon@deploy2002: helmfile [codfw] DONE helmfile.d/services/kartotherian: apply
* 11:36 mvernon@deploy2002: helmfile [codfw] START helmfile.d/services/kartotherian: apply
* 11:21 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:18 mvernon@deploy2002: helmfile [codfw] DONE helmfile.d/services/kartotherian: apply
* 11:16 mvernon@deploy2002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
* 11:15 mvernon@deploy2002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
* 11:15 mvernon@deploy2002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
* 11:14 mvernon@deploy2002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
* 11:07 mvernon@deploy2002: helmfile [codfw] START helmfile.d/services/kartotherian: apply
* 11:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis abstractwiki in section s5
* 11:07 mvernon@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: apply
* 11:05 mvernon@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: apply
* 10:55 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:53 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis abstractwiki in section s5
* 10:45 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 10:33 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:27 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:26 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:21 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:20 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:20 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:01 elukey@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=kartotherian,name=codfw
* 09:58 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:57 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:52 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:52 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:51 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:51 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:46 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:45 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:45 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:44 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:05 elukey@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=aux-k8s-worker200[2-5].codfw.wmnet,cluster=aux-k8s,service=kubesvc
* 09:04 elukey@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=aux-k8s-worker200[6-9].codfw.wmnet,cluster=aux-k8s,service=kubesvc
* 09:04 elukey@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=aux-k8s-worker100[6-9].eqiad.wmnet,cluster=aux-k8s,service=kubesvc
* 08:55 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker200[6-9].eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:55 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker200[6-9].eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:35 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker1009.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:35 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker1008.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker1007.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker1006.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker1009.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker1008.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker1007.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker1006.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device fasw2-c8b-codfw
* 08:29 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device fasw2-c8b-codfw
* 08:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device fasw2-c8a-codfw
* 08:29 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device fasw2-c8a-codfw
* 08:10 hashar@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.21 refs [[phab:T420479|T420479]]
* 00:33 rzl@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 00:23 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
* 00:23 rzl@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
* 00:23 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 00:22 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 00:22 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
* 00:22 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
* 00:21 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
* 00:21 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
* 00:21 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 00:20 rzl@deploy2002: helmfile [staging] START helmfile.d/services/toolhub: apply
* 00:20 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/termbox: apply
* 00:20 rzl@deploy2002: helmfile [staging] START helmfile.d/services/termbox: apply
* 00:20 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
* 00:20 rzl@deploy2002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
* 00:19 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 00:19 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 00:18 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 00:18 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 00:18 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 00:18 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 00:17 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 00:17 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 00:17 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 00:17 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 00:17 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 00:16 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
* 00:16 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 00:16 rzl@deploy2002: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 00:16 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/recommendation-api: apply
* 00:15 rzl@deploy2002: helmfile [staging] START helmfile.d/services/recommendation-api: apply
* 00:15 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
* 00:15 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
* 00:15 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1023.eqiad.wmnet with OS bookworm
* 00:15 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/push-notifications: apply
* 00:15 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:15 rzl@deploy2002: helmfile [staging] START helmfile.d/services/push-notifications: apply
* 00:15 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/proton: apply
* 00:14 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:14 rzl@deploy2002: helmfile [staging] START helmfile.d/services/proton: apply
* 00:13 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
* 00:13 rzl@deploy2002: helmfile [staging] START helmfile.d/services/page-analytics: apply
* 00:13 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 00:13 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
* 00:13 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 00:12 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 00:12 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 00:11 rzl@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
* 00:10 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 00:10 rzl@deploy2002: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 00:09 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
* 00:07 rzl@deploy2002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
* 00:07 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 00:06 rzl@deploy2002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* 00:06 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
* 00:06 rzl@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: apply
* 00:06 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
* 00:06 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1022.eqiad.wmnet with OS bookworm
* 00:06 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:06 rzl@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
* 00:06 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/image-suggestion: apply
* 00:06 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:05 rzl@deploy2002: helmfile [staging] START helmfile.d/services/image-suggestion: apply
* 00:05 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
* 00:05 rzl@deploy2002: helmfile [staging] START helmfile.d/services/geo-analytics: apply
* 00:05 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 00:05 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 00:04 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 00:04 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 00:04 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
* 00:04 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1021.eqiad.wmnet with OS bookworm
* 00:04 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:04 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
* 00:04 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 00:04 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:03 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 00:03 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 00:03 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 00:02 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
* 00:02 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
* 00:02 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
* 00:01 rzl@deploy2002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
* 00:01 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
* 00:01 rzl@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
* 00:00 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 00:00 rzl@deploy2002: helmfile [staging] START helmfile.d/services/echostore: apply
* 00:00 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
* 00:00 rzl@deploy2002: helmfile [staging] START helmfile.d/services/device-analytics: apply
== 2026-03-24 ==
* 23:59 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 23:59 rzl@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 23:59 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
* 23:59 rzl@deploy2002: helmfile [staging] START helmfile.d/services/data-gateway: apply
* 23:58 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 23:58 rzl@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 23:58 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/commons-impact-analytics: apply
* 23:58 rzl@deploy2002: helmfile [staging] START helmfile.d/services/commons-impact-analytics: apply
* 23:56 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 23:56 rzl@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
* 23:56 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 23:56 rzl@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
* 23:54 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/chart-renderer: apply
* 23:53 rzl@deploy2002: helmfile [staging] START helmfile.d/services/chart-renderer: apply
* 23:53 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1023.eqiad.wmnet with reason: host reimage
* 23:53 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/apertium: apply
* 23:52 rzl@deploy2002: helmfile [staging] START helmfile.d/services/apertium: apply
* 23:46 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1022.eqiad.wmnet with reason: host reimage
* 23:43 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1021.eqiad.wmnet with reason: host reimage
* 23:19 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1023.eqiad.wmnet with OS bookworm
* 23:19 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1022.eqiad.wmnet with OS bookworm
* 23:18 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1021.eqiad.wmnet with OS bookworm
* 23:16 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
* 23:16 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
* 23:15 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
* 23:15 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
* 22:03 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260118{{!}}Drop inactive simple summary surveys (T389393)]] (duration: 08m 15s)
* 21:57 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 21:57 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1260118{{!}}Drop inactive simple summary surveys (T389393)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1260118{{!}}Drop inactive simple summary surveys (T389393)]]
* 21:52 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259147{{!}}Address FIXME and drop not selector for section headings (T420085)]] (duration: 13m 11s)
* 21:47 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 21:44 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1259147{{!}}Address FIXME and drop not selector for section headings (T420085)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:38 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1259147{{!}}Address FIXME and drop not selector for section headings (T420085)]]
* 21:00 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=frwiki --source-pseudo-namespace=Abstract_ --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:55 jforrester@deploy2002: mwscript-k8s job started: moveBatch --wiki=frwiki '--u=Jdforrester (WMF)' --r=[[phab:T420654|T420654]] --noredirects /home/jforrester/T420654-frwiki-move # [[phab:T420654|T420654]] abstract: is now an interwiki; manual fix
* 20:55 jforrester@deploy2002: mwscript-k8s job started: moveBatch '--u=Jdforrester (WMF)' --r=[[phab:T420654|T420654]] --noredirects /home/jforrester/T420654-frwiki-move # [[phab:T420654|T420654]] abstract: is now an interwiki; manual fix
* 20:47 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=ptwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:46 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=idwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:46 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=frwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:45 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=eswiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:39 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=enwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:39 jforrester@deploy2002: mwscript-k8s job started: sql extensions/WikimediaMaintenance/maintenance/namespaceDupes.php --wiki=enwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:38 jforrester@deploy2002: mwscript-k8s job started: sql maintenance/namespaceDupes.php --wiki=enwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:38 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1256433{{!}}[wikifunctions] Drop m.wikifunctions.org from lists, we've not used it for years]], [[gerrit:1250114{{!}}Move GrowthExperiments REST API definition to IS]], [[gerrit:1259993{{!}}dumpInterwiki: Re-generate to add Abstract Wikipedia (and others) (T420654)]] (duration: 07m 46s)
* 20:33 jforrester@deploy2002: jforrester: Continuing with sync
* 20:32 jforrester@deploy2002: jforrester: Backport for [[gerrit:1256433{{!}}[wikifunctions] Drop m.wikifunctions.org from lists, we've not used it for years]], [[gerrit:1250114{{!}}Move GrowthExperiments REST API definition to IS]], [[gerrit:1259993{{!}}dumpInterwiki: Re-generate to add Abstract Wikipedia (and others) (T420654)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified the
* 20:30 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1256433{{!}}[wikifunctions] Drop m.wikifunctions.org from lists, we've not used it for years]], [[gerrit:1250114{{!}}Move GrowthExperiments REST API definition to IS]], [[gerrit:1259993{{!}}dumpInterwiki: Re-generate to add Abstract Wikipedia (and others) (T420654)]]
* 20:27 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259967{{!}}Set json object before setting Abstract Wiki Id (T420916)]], [[gerrit:1259994{{!}}AbstractPreview: apply selected preview language lang/dir to abstract preview body (T420687)]], [[gerrit:1260092{{!}}AbstractTitle: Handle pageinfo responses without normalized titles (T420725)]], [[gerrit:1259992{{!}}[abstractwiki] Don't list abstract as a langlist entry
* 20:22 jforrester@deploy2002: jforrester: Continuing with sync
* 20:22 jforrester@deploy2002: jforrester: Backport for [[gerrit:1259967{{!}}Set json object before setting Abstract Wiki Id (T420916)]], [[gerrit:1259994{{!}}AbstractPreview: apply selected preview language lang/dir to abstract preview body (T420687)]], [[gerrit:1260092{{!}}AbstractTitle: Handle pageinfo responses without normalized titles (T420725)]], [[gerrit:1259992{{!}}[abstractwiki] Don't list abstract as a langlist entry (T420654)]] s
* 20:20 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1259967{{!}}Set json object before setting Abstract Wiki Id (T420916)]], [[gerrit:1259994{{!}}AbstractPreview: apply selected preview language lang/dir to abstract preview body (T420687)]], [[gerrit:1260092{{!}}AbstractTitle: Handle pageinfo responses without normalized titles (T420725)]], [[gerrit:1259992{{!}}[abstractwiki] Don't list abstract as a langlist entry
* 20:12 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1242542{{!}}Generate our own logo thumbnails rather than using MediaWiki's (T414048)]], [[gerrit:1250095{{!}}Enwikinews: Only enable flaggedRevs in article namespace (T418066)]], [[gerrit:1252684{{!}}Disable magic links on afwiki (T420142)]] (duration: 09m 22s)
* 20:08 jforrester@deploy2002: jforrester, pppery: Continuing with sync
* 20:05 jforrester@deploy2002: jforrester, pppery: Backport for [[gerrit:1242542{{!}}Generate our own logo thumbnails rather than using MediaWiki's (T414048)]], [[gerrit:1250095{{!}}Enwikinews: Only enable flaggedRevs in article namespace (T418066)]], [[gerrit:1252684{{!}}Disable magic links on afwiki (T420142)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:03 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1242542{{!}}Generate our own logo thumbnails rather than using MediaWiki's (T414048)]], [[gerrit:1250095{{!}}Enwikinews: Only enable flaggedRevs in article namespace (T418066)]], [[gerrit:1252684{{!}}Disable magic links on afwiki (T420142)]]
* 19:42 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 19:42 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 19:41 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 19:39 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1258300{{!}}tests: Make many things static for PHPUnit 10 (T420844)]], [[gerrit:1258301{{!}}phpunit.xml: Update configuration for PHPUnit 10 (T420844)]] (duration: 07m 21s)
* 19:35 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 19:35 reedy@deploy2002: reedy: Continuing with sync
* 19:34 reedy@deploy2002: reedy: Backport for [[gerrit:1258300{{!}}tests: Make many things static for PHPUnit 10 (T420844)]], [[gerrit:1258301{{!}}phpunit.xml: Update configuration for PHPUnit 10 (T420844)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:32 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1258300{{!}}tests: Make many things static for PHPUnit 10 (T420844)]], [[gerrit:1258301{{!}}phpunit.xml: Update configuration for PHPUnit 10 (T420844)]]
* 19:02 inflatador: bking@apt1002 `sudo -E reprepro -C component/opensearch2 include trixie-wikimedia ~/wmf-opensearch-search-plugins-2.19.5+3-trixie/wmf-opensearch-search-plugins_2.19.5+3_amd64.changes`
* 18:48 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 18:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1170: Degraded drive replaced [[phab:T420873|T420873]]
* 18:43 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 18:36 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 18:35 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 18:25 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:24 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:20 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 18:20 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 18:13 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 18:11 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 18:07 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on phab2002.codfw.wmnet with reason: [[phab:T420228|T420228]]
* 18:01 aokoth@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 18:01 aokoth@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 18:01 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:00 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:00 mutante: codesearch9.codesearch - systemctl restart hound_proxy ([[phab:T421147|T421147]])
* 17:34 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 17:30 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 17:20 blake@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 17:20 blake@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 17:20 blake@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 17:20 blake@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 17:00 blake@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:00 blake@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:00 blake@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:00 blake@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 16:47 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 16:38 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1113.*
* 16:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1170: Degraded drive replaced [[phab:T420873|T420873]]
* 16:24 sfaci@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:24 sfaci@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:22 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1113.eqiad.wmnet with OS trixie
* 16:05 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 16:04 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 16:03 blake@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 16:03 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 16:03 blake@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 16:03 blake@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 16:03 blake@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 16:03 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1113.eqiad.wmnet with reason: host reimage
* 15:54 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1113.eqiad.wmnet with reason: host reimage
* 15:54 bjensen: Services portion of the datacenter switchover is complete
* 15:51 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker2009.codfw.wmnet with OS trixie
* 15:46 blake@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) depool all services in codfw: Datacenter Switchover - [[phab:T413974|T413974]]
* 15:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 15:38 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1113.eqiad.wmnet with OS trixie
* 15:38 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1113.eqiad.wmnet with OS trixie
* 15:36 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 15:34 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker2009.codfw.wmnet with reason: host reimage
* 15:30 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker2009.codfw.wmnet with reason: host reimage
* 15:20 blake@cumin1003: START - Cookbook sre.discovery.datacenter depool all services in codfw: Datacenter Switchover - [[phab:T413974|T413974]]
* 15:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1113.eqiad.wmnet with OS trixie
* 15:18 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker2009.codfw.wmnet with OS trixie
* 15:16 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker2008.codfw.wmnet with OS trixie
* 14:59 blake@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool codfw [reason: no reason specified, no task ID specified]
* 14:59 blake@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool codfw [reason: no reason specified, no task ID specified]
* 14:59 bjensen: beginning the Traffic and Services portions of the DC switchover, operational followup will be in #wikimedia-sre
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker2008.codfw.wmnet with reason: host reimage
* 14:56 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker2008.codfw.wmnet with reason: host reimage
* 14:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:50 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1009.eqiad.wmnet with OS trixie
* 14:44 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker2008.codfw.wmnet with OS trixie
* 14:42 aokoth@dns1004: END - running authdns-update
* 14:41 aokoth@dns1004: START - running authdns-update
* 14:34 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1009.eqiad.wmnet with reason: host reimage
* 14:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2009.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:27 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1009.eqiad.wmnet with reason: host reimage
* 14:26 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker2009.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:25 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:23 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:20 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2008.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:19 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:19 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:16 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1009.eqiad.wmnet with OS trixie
* 14:15 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker2008.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:14 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
* 14:13 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 14:13 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 14:13 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 14:12 dcausse@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259979{{!}}Revert "search: use the discovery ns record for the semanticsearch cluster"]] (duration: 06m 54s)
* 14:08 dcausse@deploy2002: dcausse: Continuing with sync
* 14:07 dcausse@deploy2002: dcausse: Backport for [[gerrit:1259979{{!}}Revert "search: use the discovery ns record for the semanticsearch cluster"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2007.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:05 dcausse@deploy2002: Started scap sync-world: Backport for [[gerrit:1259979{{!}}Revert "search: use the discovery ns record for the semanticsearch cluster"]]
* 14:04 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1008.eqiad.wmnet with OS trixie
* 14:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker2007.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:59 jforrester@deploy2002: mwscript-k8s job started: sql --wiki=abstractwiki /srv/mediawiki/php-1.46.0-wmf.20/extensions/Translate/sql/mysql/translate_message_group_subscriptions.sql # [[phab:T420656|T420656]] translate_message_group_subscriptions
* 13:59 dcausse@deploy2002: Sync cancelled.
* 13:57 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:52 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1008.eqiad.wmnet with reason: host reimage
* 13:46 dcausse@deploy2002: dcausse: Backport for [[gerrit:1259875{{!}}search: use the discovery ns record for the semanticsearch cluster (T414484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:44 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1008.eqiad.wmnet with reason: host reimage
* 13:44 dcausse@deploy2002: Started scap sync-world: Backport for [[gerrit:1259875{{!}}search: use the discovery ns record for the semanticsearch cluster (T414484)]]
* 13:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1008.eqiad.wmnet with OS trixie
* 13:32 sukhe: sudo cumin -b1 -s20 'C:bird' "run-puppet-agent --enable 'merging CR {{Gerrit|1248385}}, [[phab:T413740|T413740]]'"
* 13:30 cmelo@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259231{{!}}Enable the CampaignEvents extension on all wikibooks (T419597)]], [[gerrit:1259237{{!}}Enable $wgCampaignEventsEnableEventGoals in prod wikis (T414149)]] (duration: 12m 43s)
* 13:26 cmelo@deploy2002: cmelo, daimona: Continuing with sync
* 13:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1007.eqiad.wmnet with OS trixie
* 13:23 sukhe: sudo cumin 'C:bird' "disable-puppet 'merging CR {{Gerrit|1248385}}, [[phab:T413740|T413740]]'"
* 13:20 cmelo@deploy2002: cmelo, daimona: Backport for [[gerrit:1259231{{!}}Enable the CampaignEvents extension on all wikibooks (T419597)]], [[gerrit:1259237{{!}}Enable $wgCampaignEventsEnableEventGoals in prod wikis (T414149)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:18 cmelo@deploy2002: Started scap sync-world: Backport for [[gerrit:1259231{{!}}Enable the CampaignEvents extension on all wikibooks (T419597)]], [[gerrit:1259237{{!}}Enable $wgCampaignEventsEnableEventGoals in prod wikis (T414149)]]
* 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1007.eqiad.wmnet with reason: host reimage
* 13:04 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) payments1012.frack.eqiad.wmnet on all recursors
* 13:04 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache payments1012.frack.eqiad.wmnet on all recursors
* 13:04 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) payments1011.frack.eqiad.wmnet on all recursors
* 13:03 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache payments1011.frack.eqiad.wmnet on all recursors
* 13:03 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) payments1010.frack.eqiad.wmnet on all recursors
* 13:03 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache payments1010.frack.eqiad.wmnet on all recursors
* 13:02 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1007.eqiad.wmnet with reason: host reimage
* 13:00 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:00 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify records for payments servers frack - cmooney@cumin1003"
* 13:00 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify records for payments servers frack - cmooney@cumin1003"
* 12:56 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 12:50 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1007.eqiad.wmnet with OS trixie
* 12:02 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1017.eqiad.wmnet
* 12:02 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1017.eqiad.wmnet
* 12:01 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s1
* 11:53 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:53 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:51 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1017.eqiad.wmnet with reason: Rebooting clouddb1017 [[phab:T419960|T419960]]
* 11:51 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1017.eqiad.wmnet
* 11:51 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1017.eqiad.wmnet
* 11:51 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s1
* 11:49 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:49 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:48 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:48 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:46 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:46 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:40 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1006.eqiad.wmnet with OS trixie
* 11:36 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin1001.eqiad.wmnet
* 11:32 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet,service=x3
* 11:32 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1016.eqiad.wmnet,service=x3
* 11:32 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcumin1001.eqiad.wmnet
* 11:31 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1023.eqiad.wmnet,service=x3
* 11:31 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=x3
* 11:31 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1023.eqiad.wmnet,service=x3
* 11:31 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1022.eqiad.wmnet,service=x3
* 11:27 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s3
* 11:27 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1023.eqiad.wmnet,service=s3
* 11:27 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1023.eqiad.wmnet,service=s3
* 11:26 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1006.eqiad.wmnet with reason: host reimage
* 11:22 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:19 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:19 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1023.eqiad.wmnet,service=s3
* 11:18 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1006.eqiad.wmnet with reason: host reimage
* 11:18 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s3
* 11:17 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s3
* 11:14 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1023.eqiad.wmnet,service=s3
* 11:14 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:07 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1006.eqiad.wmnet with OS trixie
* 10:55 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:55 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:53 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker2007.codfw.wmnet with OS trixie
* 10:49 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker2006.codfw.wmnet with OS trixie
* 10:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:36 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker2007.codfw.wmnet with reason: host reimage
* 10:33 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker2006.codfw.wmnet with reason: host reimage
* 10:30 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:29 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:29 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker2007.codfw.wmnet with reason: host reimage
* 10:28 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker2006.codfw.wmnet with reason: host reimage
* 10:22 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:21 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:20 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:20 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:17 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host aux-k8s-worker1006.eqiad.wmnet with OS trixie
* 10:17 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker2007.codfw.wmnet with OS trixie
* 10:16 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker2006.codfw.wmnet with OS trixie
* 10:07 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1006.eqiad.wmnet with OS trixie
* 09:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:46 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:43 ayounsi@cumin1003: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:34 ayounsi@cumin1003: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:31 ayounsi@cumin1003: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:31 ayounsi@cumin1003: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:29 ayounsi@cumin1003: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:29 ayounsi@cumin1003: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:23 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4008.ulsfo.wmnet with OS bookworm
* 09:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 09:01 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 08:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main'.
* 08:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old ulsfo ganeti VIP - ayounsi@cumin1003"
* 08:50 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old ulsfo ganeti VIP - ayounsi@cumin1003"
* 08:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:46 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1170: Degraded drive [[phab:T420873|T420873]]
* 08:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:45 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1170: Degraded drive [[phab:T420873|T420873]]
* 08:43 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:39 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host ganeti4008.ulsfo.wmnet with OS bookworm
* 08:31 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main'.
* 08:31 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:29 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:27 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:27 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:25 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:24 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:13 hashar@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.21 refs [[phab:T420479|T420479]]
* 07:59 hashar: Changed https://logstash.wikimedia.org/ default page back to /app/dashboards
* 04:01 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.18 (duration: 01m 13s)
* 03:42 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.21 refs [[phab:T420479|T420479]] (duration: 39m 27s)
* 03:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.21 refs [[phab:T420479|T420479]]
* 02:46 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 04s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:56 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 01:40 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1104.*
* 01:37 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1104.eqiad.wmnet with OS trixie
* 01:14 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1104.eqiad.wmnet with reason: host reimage
* 01:08 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1104.eqiad.wmnet with reason: host reimage
* 00:52 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1104.eqiad.wmnet with OS trixie
* 00:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 00:18 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1115.eqiad.wmnet with OS trixie
== 2026-03-23 ==
* 22:51 rzl: root@apt1002:~# reprepro --noskipold --restrict vopsbot update bookworm-wikimedia
* 22:44 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 22:28 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host an-worker1172.eqiad.wmnet
* 22:25 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1104.eqiad.wmnet with OS trixie
* 22:07 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 22:05 maryum: Deployed security fix for [[phab:T415584|T415584]]
* 21:53 maryum: Deployed security fix for [[phab:T419192|T419192]]
* 21:41 maryum: Deployed security fix for [[phab:T419168|T419168]]
* 21:35 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 21:25 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255847{{!}}testwiki: Add temporary groups for security testing]] (duration: 12m 33s)
* 21:22 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 21:21 catrope@deploy2002: catrope: Continuing with sync
* 21:18 catrope@deploy2002: catrope: Backport for [[gerrit:1255847{{!}}testwiki: Add temporary groups for security testing]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:12 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1255847{{!}}testwiki: Add temporary groups for security testing]]
* 21:05 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1106.eqiad.wmnet [reason: trixie reimaging]
* 21:05 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1106.eqiad.wmnet [reason: trixie reimaging]
* 21:05 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1104.eqiad.wmnet with OS trixie
* 21:04 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1104.eqiad.wmnet [reason: trixie reimaging]
* 21:03 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1103.eqiad.wmnet [reason: trixie reimaging]
* 20:58 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1256394{{!}}Abstract Wikipedia: Fix API call to get page info (T420725)]], [[gerrit:1259085{{!}}[abstractwiki] Enable the Translate extension (T420656)]], [[gerrit:1250113{{!}}Move testwiki-only Attribution REST API definition to IS]] (duration: 11m 12s)
* 20:54 jforrester@deploy2002: jforrester: Continuing with sync
* 20:53 jforrester@deploy2002: jforrester: Backport for [[gerrit:1256394{{!}}Abstract Wikipedia: Fix API call to get page info (T420725)]], [[gerrit:1259085{{!}}[abstractwiki] Enable the Translate extension (T420656)]], [[gerrit:1250113{{!}}Move testwiki-only Attribution REST API definition to IS]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1103.eqiad.wmnet with OS trixie
* 20:50 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy4002.wikimedia.org
* 20:50 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:50 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy4002.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
* 20:50 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy4002.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
* 20:47 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1256394{{!}}Abstract Wikipedia: Fix API call to get page info (T420725)]], [[gerrit:1259085{{!}}[abstractwiki] Enable the Translate extension (T420656)]], [[gerrit:1250113{{!}}Move testwiki-only Attribution REST API definition to IS]]
* 20:46 sukhe@cumin1003: START - Cookbook sre.dns.netbox
* 20:45 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 20:43 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1102.eqiad.wmnet [reason: trixie reimaging]
* 20:42 dani@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254448{{!}}Undeploy participant recruitment survey on ptwiki (T419275)]], [[gerrit:1254450{{!}}Undeploy participant recruitment survey on trwiki (T419275)]], [[gerrit:1254452{{!}}Undeploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1255763{{!}}testKitchen: Add custom stream name (T417050)]], [[gerrit:1259120{{!}}Enable wgCampaignEventsEnableEventGoals in…]]
* 20:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1102.eqiad.wmnet with OS trixie
* 20:41 sukhe@cumin1003: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy4002.wikimedia.org
* 20:40 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy4001.wikimedia.org
* 20:40 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:40 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy4001.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
* 20:39 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy4001.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
* 20:37 dani@deploy2002: milimetric, daimona, dani: Continuing with sync
* 20:36 dani@deploy2002: milimetric, daimona, dani: Backport for [[gerrit:1254448{{!}}Undeploy participant recruitment survey on ptwiki (T419275)]], [[gerrit:1254450{{!}}Undeploy participant recruitment survey on trwiki (T419275)]], [[gerrit:1254452{{!}}Undeploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1255763{{!}}testKitchen: Add custom stream name (T417050)]], [[gerrit:1259120{{!}}Enable wgCampaignEventsEnableEventGoals i…]]
* 20:35 sukhe@cumin1003: START - Cookbook sre.dns.netbox
* 20:34 dani@deploy2002: Started scap sync-world: Backport for [[gerrit:1254448{{!}}Undeploy participant recruitment survey on ptwiki (T419275)]], [[gerrit:1254450{{!}}Undeploy participant recruitment survey on trwiki (T419275)]], [[gerrit:1254452{{!}}Undeploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1255763{{!}}testKitchen: Add custom stream name (T417050)]], [[gerrit:1259120{{!}}Enable wgCampaignEventsEnableEventGoals in…]]
* 20:31 sukhe@cumin1003: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy4001.wikimedia.org
* 20:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1103.eqiad.wmnet with reason: host reimage
* 20:24 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1103.eqiad.wmnet with reason: host reimage
* 20:23 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 20:19 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1102.eqiad.wmnet with reason: host reimage
* 20:17 alexsanford@deploy2002: Finished scap sync-world: Backport for [[gerrit:1256472{{!}}Reduce reauth timeout for editing site JS to 10 minutes (T419605)]] (duration: 07m 32s)
* 20:14 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1102.eqiad.wmnet with reason: host reimage
* 20:13 alexsanford@deploy2002: alexsanford: Continuing with sync
* 20:11 alexsanford@deploy2002: alexsanford: Backport for [[gerrit:1256472{{!}}Reduce reauth timeout for editing site JS to 10 minutes (T419605)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 alexsanford@deploy2002: Started scap sync-world: Backport for [[gerrit:1256472{{!}}Reduce reauth timeout for editing site JS to 10 minutes (T419605)]]
* 20:08 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1103.eqiad.wmnet with OS trixie
* 20:07 alexsanford: Deployed mitigation for [[phab:T419605|T419605]]
* 19:58 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 19:58 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 19:58 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 19:58 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1102.eqiad.wmnet with OS trixie
* 19:57 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 19:54 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 19:54 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 19:54 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 19:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha-proxy4004.wikimedia.org
* 19:51 cdobbins@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1102.eqiad.wmnet with OS trixie
* 19:50 cdobbins@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1103.eqiad.wmnet with OS trixie
* 19:50 ayounsi@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha-proxy4004.wikimedia.org
* 19:47 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 19:47 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha-proxy4003.wikimedia.org
* 19:46 ayounsi@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha-proxy4003.wikimedia.org
* 19:44 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 19:44 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>aqs[1011,1014,1016-1022]*<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 19:42 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1103.eqiad.wmnet with OS trixie
* 19:42 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1103.eqiad.wmnet [reason: trixie reimaging]
* 19:41 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1101.eqiad.wmnet [reason: trixie reimaging]
* 19:41 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1102.eqiad.wmnet with OS trixie
* 19:41 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1102.eqiad.wmnet [reason: trixie reimaging]
* 19:40 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1101.eqiad.wmnet with OS trixie
* 19:39 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1100.eqiad.wmnet [reason: trixie reimaging]
* 19:37 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1100.eqiad.wmnet with OS trixie
* 19:30 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 19:18 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1101.eqiad.wmnet with reason: host reimage
* 19:14 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1100.eqiad.wmnet with reason: host reimage
* 19:13 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1101.eqiad.wmnet with reason: host reimage
* 19:13 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:13 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:10 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1100.eqiad.wmnet with reason: host reimage
* 18:59 inflatador: bking@deploy2002 restarting opensearch-semantic-search eqiad to renew certs
* 18:57 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1101.eqiad.wmnet with OS trixie
* 18:55 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1101.eqiad.wmnet with OS trixie
* 18:54 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1100.eqiad.wmnet with OS trixie
* 18:53 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1100.eqiad.wmnet with OS trixie
* 18:50 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 18:49 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 18:36 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on hcaptcha-proxy4002.wikimedia.org with reason: depooled host (soon to be decomed)
* 18:35 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on hcaptcha-proxy4001.wikimedia.org with reason: depooled host (soon to be decomed)
* 18:10 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 18:10 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 18:05 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>aqs[1011,1014,1016-1022]*<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 17:54 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 17:54 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1115.eqiad.wmnet with OS trixie
* 17:53 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase-eqiad
* 17:49 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259136{{!}}EventStreamConfig: Document not adding performer attributes to SI interaction v2 stream (T418740)]] (duration: 06m 28s)
* 17:45 dreamyjazz@deploy2002: kharlan, dreamyjazz: Continuing with sync
* 17:45 dreamyjazz@deploy2002: kharlan, dreamyjazz: Backport for [[gerrit:1259136{{!}}EventStreamConfig: Document not adding performer attributes to SI interaction v2 stream (T418740)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:43 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1259136{{!}}EventStreamConfig: Document not adding performer attributes to SI interaction v2 stream (T418740)]]
* 17:35 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:34 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1101.eqiad.wmnet with OS trixie
* 17:34 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1101.eqiad.wmnet [reason: trixie reimaging]
* 17:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:31 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1100.eqiad.wmnet with OS trixie
* 17:30 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1100.eqiad.wmnet [reason: trixie reimaging]
* 17:27 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:26 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:24 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:23 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:22 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:21 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 17:21 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:20 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 17:20 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:18 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:17 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:17 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:16 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 17:13 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:13 bd808@deploy2002: Finished deploy [releng/jenkins-deploy@f47af21] (releasing): jobs: Use TZ=UTC in branchMWSingleVersion.groovy trigger ([[phab:T404399|T404399]]) (duration: 01m 36s)
* 17:12 bd808@deploy2002: Started deploy [releng/jenkins-deploy@f47af21] (releasing): jobs: Use TZ=UTC in branchMWSingleVersion.groovy trigger ([[phab:T404399|T404399]])
* 17:12 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:09 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:08 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:08 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:08 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:07 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:06 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:04 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:04 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:03 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:02 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 17:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1001.eqiad.wmnet
* 16:56 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:56 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 14 hosts
* 16:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:55 sukhe@cumin1003: START - Cookbook sre.hosts.remove-downtime for 14 hosts
* 16:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1001.eqiad.wmnet
* 16:53 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:52 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:52 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:46 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:41 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:38 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin2001.codfw.wmnet
* 16:35 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 16:34 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:34 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:34 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcumin2001.codfw.wmnet
* 16:32 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 16:31 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 16:30 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:29 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1023.eqiad.wmnet
* 16:29 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1023.eqiad.wmnet
* 16:28 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:27 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:24 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for aqs1010.eqiad.wmnet
* 16:24 eevans@cumin1003: START - Cookbook sre.hosts.remove-downtime for aqs1010.eqiad.wmnet
* 16:21 jgreen@dns1004: END - running authdns-update
* 16:19 jgreen@dns1004: START - running authdns-update
* 16:18 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:17 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:11 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Rebooting clouddb1023 [[phab:T419960|T419960]]
* 16:09 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1025.eqiad.wmnet
* 16:09 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1025.eqiad.wmnet
* 16:09 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1025.eqiad.wmnet
* 16:04 urandom: stopping aqs1010 for SSD replacement — [[phab:T420867|T420867]]
* 16:03 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:03 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on aqs1010.eqiad.wmnet with reason: Shutting down for SSD replacement — [[phab:T420867|T420867]]
* 15:58 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1025.eqiad.wmnet
* 15:57 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1025.eqiad.wmnet with reason: Rebooting clouddb1025 [[phab:T419960|T419960]]
* 15:57 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:53 topranks: disabling puppet for nftables-enabled machines to validate new ruleset on selected hosts before wider rollout [[phab:T420715|T420715]]
* 15:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:50 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:49 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 15:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:31 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1172.eqiad.wmnet
* 15:21 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 15:20 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 15:15 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1020.eqiad.wmnet
* 15:14 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1020.eqiad.wmnet
* 15:14 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1020.eqiad.wmnet
* 15:05 btullis@cumin1003: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker1172.eqiad.wmnet
* 15:03 btullis@cumin1003: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1172.eqiad.wmnet
* 15:03 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) opensearch-ipoid.discovery.wmnet on all recursors
* 15:03 bking@cumin2002: START - Cookbook sre.dns.wipe-cache opensearch-ipoid.discovery.wmnet on all recursors
* 15:03 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 15:01 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 15:01 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) opensearch-ipoid.discovery.wmnet on all recursors
* 14:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache opensearch-ipoid.discovery.wmnet on all recursors
* 14:58 sukhe@dns1004: END - running authdns-update
* 14:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) opensearch-test.discovery.wmnet on all recursors
* 14:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache opensearch-test.discovery.wmnet on all recursors
* 14:57 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet
* 14:57 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:56 sukhe@dns1004: START - running authdns-update
* 14:56 sukhe@dns1004: END - running authdns-update
* 14:56 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1020.eqiad.wmnet with reason: Rebooting clouddb1020 [[phab:T419960|T419960]]
* 14:56 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1019.eqiad.wmnet
* 14:56 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1019.eqiad.wmnet
* 14:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:55 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1019.eqiad.wmnet
* 14:55 sukhe@dns1004: START - running authdns-update
* 14:55 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase-eqiad
* 14:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:49 sukhe@dns1004: END - running authdns-update
* 14:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:48 sukhe@dns1004: START - running authdns-update
* 14:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:46 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1003.eqiad.wmnet
* 14:45 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 14:44 sukhe@dns1004: END - running authdns-update
* 14:43 sukhe@dns1004: START - running authdns-update
* 14:40 sukhe@dns1004: FAIL - running authdns-update
* 14:39 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1003.eqiad.wmnet
* 14:38 sukhe@dns1004: START - running authdns-update
* 14:37 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 14:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) k8s-ingress-dse-aa.discovery.wmnet on all recursors
* 14:36 bking@cumin2002: START - Cookbook sre.dns.wipe-cache k8s-ingress-dse-aa.discovery.wmnet on all recursors
* 14:34 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1019.eqiad.wmnet
* 14:34 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1019.eqiad.wmnet with reason: Rebooting clouddb1019 [[phab:T419960|T419960]]
* 14:33 sukhe@dns1004: FAIL - running authdns-update
* 14:33 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1002.eqiad.wmnet
* 14:33 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1018.eqiad.wmnet
* 14:33 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1018.eqiad.wmnet
* 14:32 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1018.eqiad.wmnet
* 14:32 sukhe@dns1004: START - running authdns-update
* 14:31 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=codfw
* 14:30 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-worker1172.eqiad.wmnet with OS bullseye
* 14:30 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:27 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1002.eqiad.wmnet
* 14:22 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1018.eqiad.wmnet with reason: Rebooting clouddb1018 [[phab:T419960|T419960]]
* 14:22 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1018.eqiad.wmnet
* 14:22 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1018.eqiad.wmnet
* 14:21 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1018.eqiad.wmnet
* 14:20 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:17 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1001.eqiad.wmnet
* 14:14 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 14:14 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2352-2356].codfw.wmnet
* 14:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2352-2356].codfw.wmnet
* 14:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on db1253.eqiad.wmnet with reason: Under repair
* 14:11 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1001.eqiad.wmnet
* 14:07 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2352-2356].codfw.wmnet
* 14:04 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha2002.wikimedia.org
* 14:04 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2352-2356].codfw.wmnet
* 14:03 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2347-2351].codfw.wmnet
* 14:03 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2347-2351].codfw.wmnet
* 14:00 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha2002.wikimedia.org
* 14:00 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha2001.wikimedia.org
* 13:59 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1172.eqiad.wmnet with reason: host reimage
* 13:57 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2347-2351].codfw.wmnet
* 13:56 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha2001.wikimedia.org
* 13:56 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:55 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha1002.wikimedia.org
* 13:55 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1172.eqiad.wmnet with reason: host reimage
* 13:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2347-2351].codfw.wmnet
* 13:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2342-2346].codfw.wmnet
* 13:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2342-2346].codfw.wmnet
* 13:51 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha1002.wikimedia.org
* 13:51 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha1001.wikimedia.org
* 13:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:47 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha1001.wikimedia.org
* 13:47 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2342-2346].codfw.wmnet
* 13:43 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2342-2346].codfw.wmnet
* 13:43 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2337-2341].codfw.wmnet
* 13:43 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2337-2341].codfw.wmnet
* 13:42 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host an-worker1172.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:42 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 13:41 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2012.codfw.wmnet
* 13:41 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2011.codfw.wmnet
* 13:38 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudlb1002.eqiad.wmnet
* 13:36 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2012.codfw.wmnet
* 13:36 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2011.codfw.wmnet
* 13:36 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2337-2341].codfw.wmnet
* 13:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2337-2341].codfw.wmnet
* 13:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2336].codfw.wmnet
* 13:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2336].codfw.wmnet
* 13:30 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2003.codfw.wmnet
* 13:30 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1002.eqiad.wmnet
* 13:29 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudlb1002.eqiad.wmnet
* 13:29 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1002.eqiad.wmnet
* 13:28 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudlb1001.eqiad.wmnet
* 13:25 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2332-2336].codfw.wmnet
* 13:24 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2003.codfw.wmnet
* 13:21 jforrester@deploy2002: mwscript-k8s job started: extensions/WikimediaMaintenance/maintenance/createExtensionTables.php --wiki=abstractwiki translate # [[phab:T420656|T420656]]
* 13:21 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2332-2336].codfw.wmnet
* 13:21 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 13:20 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1001.eqiad.wmnet
* 13:20 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudlb1001.eqiad.wmnet
* 13:20 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1001.eqiad.wmnet
* 13:19 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:19 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudlb1001.eqiad.wmnet
* 13:18 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1001.eqiad.wmnet
* 13:17 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259035{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added (T420722)]], [[gerrit:1259036{{!}}tests: add coverage for WelcomeSurveyHooks::onCentralAuthPostLoginRedirect (T420722)]], [[gerrit:1259046{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added 2 (T420722)]] (duration: 11m 43s)
* 13:16 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:14 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:13 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:11 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:11 sgimeno@deploy2002: sgimeno: Continuing with sync
* 13:08 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2005-2006,2011-2018,2033-2039,2041-2042,2044,2046,2049-2051,2055-2062,2064-2065,2067-2078,2087-2095,2102-2115,2124-2179,2184-2199].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 13:08 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2186-2199].codfw.wmnet
* 13:08 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2186-2199].codfw.wmnet
* 13:07 sgimeno@deploy2002: sgimeno: Backport for [[gerrit:1259035{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added (T420722)]], [[gerrit:1259036{{!}}tests: add coverage for WelcomeSurveyHooks::onCentralAuthPostLoginRedirect (T420722)]], [[gerrit:1259046{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added 2 (T420722)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1259035{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added (T420722)]], [[gerrit:1259036{{!}}tests: add coverage for WelcomeSurveyHooks::onCentralAuthPostLoginRedirect (T420722)]], [[gerrit:1259046{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added 2 (T420722)]]
* 12:43 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "bast4006 - ayounsi@cumin1003"
* 12:42 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "bast4006 - ayounsi@cumin1003"
* 12:42 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host testvm2006.codfw.wmnet with OS bookworm
* 12:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host bast4006.wikimedia.org
* 12:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast4006.wikimedia.org with OS bookworm
* 12:34 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:28 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
* 12:22 cmooney@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
* 12:18 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast4006.wikimedia.org with reason: host reimage
* 12:16 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2186-2199].codfw.wmnet
* 12:14 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on bast4006.wikimedia.org with reason: host reimage
* 12:08 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host testvm2006.codfw.wmnet with OS bookworm
* 12:07 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2186-2199].codfw.wmnet
* 12:04 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2168-2179,2184-2185].codfw.wmnet
* 12:04 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2168-2179,2184-2185].codfw.wmnet
* 11:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2168-2179,2184-2185].codfw.wmnet
* 11:44 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2168-2179,2184-2185].codfw.wmnet
* 11:40 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2154-2167].codfw.wmnet
* 11:40 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2154-2167].codfw.wmnet
* 11:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2154-2167].codfw.wmnet
* 11:23 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2154-2167].codfw.wmnet
* 11:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2140-2153].codfw.wmnet
* 11:20 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS bookworm
* 11:20 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2140-2153].codfw.wmnet
* 11:19 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM bast4006.wikimedia.org - ayounsi@cumin1003"
* 11:19 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM bast4006.wikimedia.org - ayounsi@cumin1003"
* 11:19 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) bast4006.wikimedia.org on all recursors
* 11:19 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache bast4006.wikimedia.org on all recursors
* 11:19 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:19 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM bast4006.wikimedia.org - ayounsi@cumin1003"
* 11:19 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM bast4006.wikimedia.org - ayounsi@cumin1003"
* 11:15 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 11:15 ayounsi@cumin1003: START - Cookbook sre.ganeti.makevm for new host bast4006.wikimedia.org
* 11:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts install4003.wikimedia.org
* 11:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install4003.wikimedia.org decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003"
* 11:08 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install4003.wikimedia.org decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003"
* 11:08 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2140-2153].codfw.wmnet
* 11:05 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 11:00 ayounsi@cumin1003: START - Cookbook sre.hosts.decommission for hosts install4003.wikimedia.org
* 10:57 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:55 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2140-2153].codfw.wmnet
* 10:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2126-2139].codfw.wmnet
* 10:55 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2126-2139].codfw.wmnet
* 10:44 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2126-2139].codfw.wmnet
* 10:43 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:38 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 10:38 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:38 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:38 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 10:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2126-2139].codfw.wmnet
* 10:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2104-2115,2124-2125].codfw.wmnet
* 10:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2104-2115,2124-2125].codfw.wmnet
* 10:28 topranks: disable puppet on routed-ganeti hosts to test nftables update on specific nodes [[phab:T420715|T420715]]
* 10:27 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s1
* 10:25 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s1
* 10:25 ayounsi@dns1004: END - running authdns-update
* 10:24 ayounsi@dns1004: START - running authdns-update
* 10:23 btullis@cumin1003: START - Cookbook sre.hosts.provision for host an-worker1172.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 10:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2104-2115,2124-2125].codfw.wmnet
* 10:20 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s4
* 10:18 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s4
* 10:13 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s8
* 10:11 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s8
* 10:09 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1172.eqiad.wmnet with OS bullseye
* 10:08 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2104-2115,2124-2125].codfw.wmnet
* 10:05 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s7
* 10:05 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2076-2078,2087-2095,2102-2103].codfw.wmnet
* 10:04 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2076-2078,2087-2095,2102-2103].codfw.wmnet
* 10:04 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s7
* 09:58 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s3
* 09:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:57 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s3
* 09:53 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:53 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2076-2078,2087-2095,2102-2103].codfw.wmnet
* 09:52 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s2
* 09:49 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s2
* 09:49 blake@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 09:49 blake@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 09:49 blake@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 09:49 blake@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 09:44 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2076-2078,2087-2095,2102-2103].codfw.wmnet
* 09:44 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s5
* 09:44 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2060-2062,2064-2065,2067-2075].codfw.wmnet
* 09:44 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2060-2062,2064-2065,2067-2075].codfw.wmnet
* 09:42 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s5
* 09:40 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:39 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:33 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s6
* 09:32 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s6
* 09:29 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:29 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:25 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 09:25 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 09:24 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section es7
* 09:23 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section es7
* 09:22 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 09:22 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section es6
* 09:16 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section es6
* 09:11 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:11 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:10 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section x3
* 09:09 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section x3
* 09:08 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2060-2062,2064-2065,2067-2075].codfw.wmnet
* 09:05 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:05 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:02 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section x1
* 09:01 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section x1
* 09:00 federico3: starting [[phab:T416706|T416706]]
* 09:00 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2060-2062,2064-2065,2067-2075].codfw.wmnet
* 08:59 fceratto@cumin1003: END (FAIL) - Cookbook sre.switchdc.databases.prepare (exit_code=99) for the switch from eqiad to codfw for section test-s4
* 08:59 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from eqiad to codfw for section test-s4
* 08:59 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2038-2039,2041-2042,2044,2046,2049-2051,2055-2059].codfw.wmnet
* 08:59 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2038-2039,2041-2042,2044,2046,2049-2051,2055-2059].codfw.wmnet
* 08:50 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2038-2039,2041-2042,2044,2046,2049-2051,2055-2059].codfw.wmnet
* 08:46 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255736{{!}}hcaptcha: Use the global edit key for MobileFrontend edits if present (T420574)]] (duration: 14m 42s)
* 08:44 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 08:43 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 08:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2038-2039,2041-2042,2044,2046,2049-2051,2055-2059].codfw.wmnet
* 08:40 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2005-2006,2011-2018,2033-2037].codfw.wmnet
* 08:40 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2005-2006,2011-2018,2033-2037].codfw.wmnet
* 08:40 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:39 kharlan@deploy2002: kharlan: Continuing with sync
* 08:38 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:37 kharlan@deploy2002: kharlan: Backport for [[gerrit:1255736{{!}}hcaptcha: Use the global edit key for MobileFrontend edits if present (T420574)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 08:31 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1255736{{!}}hcaptcha: Use the global edit key for MobileFrontend edits if present (T420574)]]
* 08:29 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2005-2006,2011-2018,2033-2037].codfw.wmnet
* 08:19 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2005-2006,2011-2018,2033-2037].codfw.wmnet
* 08:18 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2005-2006,2011-2018,2033-2039,2041-2042,2044,2046,2049-2051,2055-2062,2064-2065,2067-2078,2087-2095,2102-2115,2124-2179,2184-2199].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 07:45 kartik@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254149{{!}}Enable ULS rewrite beta feature (T418187 T253303)]] (duration: 41m 30s)
* 07:42 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 07:33 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 07:30 kartik@deploy2002: kartik, abi: Continuing with sync
* 07:30 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 07:29 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 07:22 kartik@deploy2002: kartik, abi: Backport for [[gerrit:1254149{{!}}Enable ULS rewrite beta feature (T418187 T253303)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 07:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 07:17 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 07:16 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 07:03 kartik@deploy2002: Started scap sync-world: Backport for [[gerrit:1254149{{!}}Enable ULS rewrite beta feature (T418187 T253303)]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 55s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-22 ==
* 02:50 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doh7004.wikimedia.org with reason: depooled host
* 02:50 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doh7003.wikimedia.org with reason: depooled host
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 21s)
* 02:01 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-20 ==
* 23:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2013.codfw.wmnet
* 23:30 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs2013.codfw.wmnet
* 22:34 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host lvs2013.codfw.wmnet
* 22:34 brett: Started pybal on lvs2013
* 22:27 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 21:57 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5023.eqsin.wmnet [reason: trixie reimaging]
* 21:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5023.eqsin.wmnet with OS trixie
* 21:55 hashar: Upgrading CI Jenkins [[phab:T420477|T420477]]
* 21:25 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5023.eqsin.wmnet with reason: host reimage
* 21:21 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5023.eqsin.wmnet with reason: host reimage
* 21:04 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2013.codfw.wmnet with reason: debugging ipip
* 20:46 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5023.eqsin.wmnet with OS trixie
* 20:45 mutante: contint1003/2003 apt remove --purge apache2* ; apt remove --purge php* {{!}} [[phab:T418521|T418521]]
* 20:43 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
* 20:40 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 20:38 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5023.eqsin.wmnet with OS trixie
* 20:24 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doh3006.wikimedia.org with reason: depooled host
* 20:24 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doh3005.wikimedia.org with reason: depooled host
* 20:23 sukhe@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on doh3005.wikimedia.org with reason: depooled host
* 19:50 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2013.codfw.wmnet with reason: debugging ipip
* 19:33 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
* 19:30 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 19:21 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy (exit_code=0) rolling reboot on A:tcpproxy and A:tcpproxy
* 19:16 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5023.eqsin.wmnet with OS trixie
* 19:16 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5023.eqsin.wmnet [reason: trixie reimaging]
* 19:16 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5021.eqsin.wmnet [reason: trixie reimaging]
* 19:14 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5021.eqsin.wmnet with OS trixie
* 18:52 cdobbins@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2013.codfw.wmnet with reason: reboot
* 18:43 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5021.eqsin.wmnet with reason: host reimage
* 18:39 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5021.eqsin.wmnet with reason: host reimage
* 18:28 cdobbins@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2013.codfw.wmnet with reason: reboot
* 18:16 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy rolling reboot on A:tcpproxy and A:tcpproxy
* 18:14 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on db1253.eqiad.wmnet with reason: [[phab:T420041|T420041]]
* 17:59 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5021.eqsin.wmnet with OS trixie
* 17:54 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5021.eqsin.wmnet with OS trixie
* 17:51 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host lvs2014.codfw.wmnet
* 17:40 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on contint1003.wikimedia.org with reason: jenkins on java21
* 17:39 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2014.codfw.wmnet
* 16:54 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 16:54 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1172.eqiad.wmnet with OS bullseye
* 16:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 16:33 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5021.eqsin.wmnet with OS trixie
* 16:32 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5021.eqsin.wmnet [reason: trixie reimaging]
* 16:09 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1172.eqiad.wmnet with OS bullseye
* 16:08 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2002.codfw.wmnet
* 16:02 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf2002.codfw.wmnet
* 15:51 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2001.codfw.wmnet
* 15:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 15:45 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf2001.codfw.wmnet
* 15:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2041.codfw.wmnet
* 15:37 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc2041.codfw.wmnet
* 15:32 cparle@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:32 cparle@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-eqiad
* 15:16 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2040.codfw.wmnet
* 15:10 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc2040.codfw.wmnet
* 15:02 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 15:01 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 15:00 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:59 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:58 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:58 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:57 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:56 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:55 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:50 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2002].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 14:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2002.codfw.wmnet
* 14:45 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2002.codfw.wmnet
* 14:44 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:44 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:37 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2002.codfw.wmnet
* 14:37 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2002.codfw.wmnet
* 14:37 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2001.codfw.wmnet
* 14:37 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2001.codfw.wmnet
* 14:36 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:35 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:35 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:34 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:30 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2001.codfw.wmnet
* 14:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2001.codfw.wmnet
* 14:29 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2002].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 14:27 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1335-1349].eqiad.wmnet
* 14:27 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1335-1349].eqiad.wmnet
* 14:26 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2039.codfw.wmnet
* 14:21 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc2039.codfw.wmnet
* 14:16 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:16 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2038.codfw.wmnet
* 14:08 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc2038.codfw.wmnet
* 13:54 jgreen@dns1004: END - running authdns-update
* 13:52 jgreen@dns1004: START - running authdns-update
* 13:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:39 inflatador: bking@deploy2002 restarting opensearch-ipoid cluster to apply new certificates
* 13:33 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-eqiad
* 13:20 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:14 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-canary
* 13:14 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for doh[3005-3006].wikimedia.org
* 13:14 sukhe@cumin1003: START - Cookbook sre.hosts.remove-downtime for doh[3005-3006].wikimedia.org
* 13:08 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-canary
* 13:05 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:03 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:58 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2006.codfw.wmnet
* 12:56 cparle@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 12:55 cparle@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 12:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2006.codfw.wmnet
* 12:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2001.codfw.wmnet
* 12:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2001.codfw.wmnet
* 12:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1005.eqiad.wmnet
* 12:35 jiji@cumin1003: END (ERROR) - Cookbook sre.memcached.roll-reboot-restart (exit_code=97) rolling reboot on A:memcached-codfw
* 12:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1005.eqiad.wmnet
* 12:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw2001.wikimedia.org
* 12:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw2001.wikimedia.org
* 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw1001.wikimedia.org
* 12:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw1001.wikimedia.org
* 11:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 11:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1172.eqiad.wmnet with OS bullseye
* 11:27 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 11:24 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 10:26 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 10:13 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:12 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:10 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:05 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:05 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:04 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:02 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:58 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:57 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:57 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:56 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:56 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:55 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:53 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:50 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:46 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:46 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:45 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:37 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:36 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:36 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:35 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:35 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:34 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:34 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:33 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:26 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:25 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:24 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:23 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:19 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:19 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:18 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:18 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:18 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:15 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:57 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 05:30 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5024.eqsin.wmnet [reason: trixie reimaging]
* 05:30 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5019.eqsin.wmnet [reason: trixie reimaging]
* 02:43 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on doh3005.wikimedia.org with reason: alerting is flapping
* 02:42 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on doh3006.wikimedia.org with reason: alerting is flapping
* 01:21 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5019.eqsin.wmnet with OS trixie
* 01:17 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5024.eqsin.wmnet with OS trixie
* 00:48 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5019.eqsin.wmnet with reason: host reimage
* 00:44 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
* 00:38 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5019.eqsin.wmnet with reason: host reimage
* 00:37 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
* 00:05 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5024.eqsin.wmnet with OS trixie
* 00:01 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5024.eqsin.wmnet with OS trixie
* 00:01 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5024.eqsin.wmnet with OS trixie
== 2026-03-19 ==
* 23:59 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5019.eqsin.wmnet with OS trixie
* 23:40 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255801{{!}}Make the handler follow the thumb steps (T414805)]] (duration: 06m 14s)
* 23:36 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 23:35 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1255801{{!}}Make the handler follow the thumb steps (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:33 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1255801{{!}}Make the handler follow the thumb steps (T414805)]]
* 22:48 zabe@deploy2002: mwscript-k8s job started: foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https # [[phab:T420643|T420643]]
* 22:19 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 22:18 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 22:17 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 22:16 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 22:08 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255886{{!}}Set WikiLambdaAbstractNamespaces's merge_strategy to provide_default (T420649)]] (duration: 06m 46s)
* 22:04 jforrester@deploy2002: jforrester: Continuing with sync
* 22:03 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255886{{!}}Set WikiLambdaAbstractNamespaces's merge_strategy to provide_default (T420649)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:01 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255886{{!}}Set WikiLambdaAbstractNamespaces's merge_strategy to provide_default (T420649)]]
* 21:58 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5019.eqsin.wmnet with OS trixie
* 21:57 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase-codfw
* 21:57 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5019.eqsin.wmnet [reason: trixie reimaging]
* 21:56 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5024.eqsin.wmnet with OS trixie
* 21:56 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5024.eqsin.wmnet [reason: trixie reimaging]
* 21:55 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255765{{!}}Implement addListener fallback for older browsers in matchMedia (T419717)]] (duration: 07m 17s)
* 21:51 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 21:49 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1255765{{!}}Implement addListener fallback for older browsers in matchMedia (T419717)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:48 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1255765{{!}}Implement addListener fallback for older browsers in matchMedia (T419717)]]
* 21:29 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255881{{!}}Skins: Address issue with blurry images for large thumbnails (T375981)]] (duration: 07m 03s)
* 21:25 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 21:24 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1255881{{!}}Skins: Address issue with blurry images for large thumbnails (T375981)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:22 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1255881{{!}}Skins: Address issue with blurry images for large thumbnails (T375981)]]
* 21:11 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup2020.codfw.wmnet with reason: kernel module reload
* 21:10 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 11 hosts with reason: kernel module reload
* 20:36 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254865{{!}}Deploy Extension:PersonalDashboard to English Wikipedia (T418367)]], [[gerrit:1253654{{!}}Deploy PRV to 13 wikis (T420273)]] (duration: 11m 00s)
* 20:32 kgraessle@deploy2002: kgraessle, arlolra: Continuing with sync
* 20:27 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1016.eqiad.wmnet
* 20:27 kgraessle@deploy2002: kgraessle, arlolra: Backport for [[gerrit:1254865{{!}}Deploy Extension:PersonalDashboard to English Wikipedia (T418367)]], [[gerrit:1253654{{!}}Deploy PRV to 13 wikis (T420273)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1254865{{!}}Deploy Extension:PersonalDashboard to English Wikipedia (T418367)]], [[gerrit:1253654{{!}}Deploy PRV to 13 wikis (T420273)]]
* 20:25 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1016.eqiad.wmnet
* 20:11 cdobbins@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs1016.eqiad.wmnet with reason: reboot
* 20:01 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:01 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add analytic vlan hostnames - cmooney@cumin1003"
* 20:01 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add analytic vlan hostnames - cmooney@cumin1003"
* 19:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1018.eqiad.wmnet
* 19:56 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 19:56 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1018.eqiad.wmnet
* 19:55 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 19:53 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) 4.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.8.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa. on all recursors
* 19:53 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache 4.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.8.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa. on all recursors
* 19:52 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 19:51 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 7 hosts with reason: kernel module reload
* 19:44 topranks: disable IPv6 VRRP for et-1/0/5.1023 sub-interfaces on eqiad core routers [[phab:T405562|T405562]]
* 19:36 brett: stopping pybal/puppet on lvs1018 for reboots
* 19:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: reboots
* 19:00 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 6 hosts with reason: kernel module reload
* 19:00 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1019.eqiad.wmnet
* 19:00 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase-codfw
* 19:00 topranks: add vlan sub-interface for analytics1-d-eqiad vlan to leaf switches in eqiad row d [[phab:T405562|T405562]]
* 18:44 cdobbins@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs1019.eqiad.wmnet with reason: planned reboot
* 18:42 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:aqs-codfw
* 18:31 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255824{{!}}RepoBooks::onMediaWikiServices: Skip all low NSes, not just NS0 (T420617)]], [[gerrit:1255820{{!}}SpecialAbstractContent: Fix hard-coded policy list page namespace]], [[gerrit:1255794{{!}}[abstractwiki] Allow "Abstract:" as well as "Abstract Wikipedia:" as NS_PROJECT]] (duration: 06m 20s)
* 18:27 jforrester@deploy2002: jforrester: Continuing with sync
* 18:26 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255824{{!}}RepoBooks::onMediaWikiServices: Skip all low NSes, not just NS0 (T420617)]], [[gerrit:1255820{{!}}SpecialAbstractContent: Fix hard-coded policy list page namespace]], [[gerrit:1255794{{!}}[abstractwiki] Allow "Abstract:" as well as "Abstract Wikipedia:" as NS_PROJECT]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:24 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255824{{!}}RepoBooks::onMediaWikiServices: Skip all low NSes, not just NS0 (T420617)]], [[gerrit:1255820{{!}}SpecialAbstractContent: Fix hard-coded policy list page namespace]], [[gerrit:1255794{{!}}[abstractwiki] Allow "Abstract:" as well as "Abstract Wikipedia:" as NS_PROJECT]]
* 18:02 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1020.eqiad.wmnet
* 17:55 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
* 17:46 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 17:46 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 17:45 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host lvs1020.eqiad.wmnet
* 17:44 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
* 17:30 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 17:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host hcaptcha-proxy4004.wikimedia.org
* 17:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy4004.wikimedia.org with OS bookworm
* 17:26 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on contint1003.wikimedia.org with reason: jenkins on java21
* 17:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp5026.eqsin.wmnet
* 17:22 bd808@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:22 bd808@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:22 bd808@deploy2002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:22 bd808@deploy2002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:21 bd808@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 17:15 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp5026.eqsin.wmnet
* 17:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy4004.wikimedia.org with reason: host reimage
* 17:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts doh4002.wikimedia.org
* 17:08 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:08 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: doh4002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 17:08 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy4004.wikimedia.org with reason: host reimage
* 17:08 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: doh4002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 17:07 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 17:07 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 17:05 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp5026.eqsin.wmnet with reason: firmware updates
* 17:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 17:03 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5025.*
* 17:01 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp5025.eqsin.wmnet
* 16:59 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts doh4002.wikimedia.org
* 16:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts doh4001.wikimedia.org
* 16:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: doh4001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 16:59 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: doh4001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 16:58 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 16:57 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 16:57 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 16:56 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 16:54 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:52 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp5025.eqsin.wmnet
* 16:50 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts doh4001.wikimedia.org
* 16:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-master1004.eqiad.wmnet
* 16:46 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1151.eqiad.wmnet
* 16:45 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5026.eqsin.wmnet with OS trixie
* 16:44 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy4004.wikimedia.org with OS bookworm
* 16:44 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet,service=s3
* 16:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4004.wikimedia.org - jmm@cumin2002"
* 16:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4004.wikimedia.org - jmm@cumin2002"
* 16:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy4004.wikimedia.org on all recursors
* 16:43 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy4004.wikimedia.org on all recursors
* 16:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4004.wikimedia.org - jmm@cumin2002"
* 16:42 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4004.wikimedia.org - jmm@cumin2002"
* 16:42 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255039{{!}}Activate Abstract Wikipedia (T411723)]] (duration: 06m 09s)
* 16:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp5025.eqsin.wmnet with reason: firmware updates
* 16:41 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-master1004.eqiad.wmnet
* 16:40 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5025.eqsin.wmnet with OS trixie
* 16:39 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1151.eqiad.wmnet
* 16:39 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:39 jmm@cumin2002: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
* 16:38 jforrester@deploy2002: jforrester: Continuing with sync
* 16:38 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255039{{!}}Activate Abstract Wikipedia (T411723)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:36 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255039{{!}}Activate Abstract Wikipedia (T411723)]]
* 16:35 jmm@cumin2002: END (PASS) - Cookbook sre.netbox.restart-reboot (exit_code=0) rolling reboot on A:netbox
* 16:33 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255779{{!}}Revert "[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki"]] (duration: 07m 19s)
* 16:29 jforrester@deploy2002: jforrester: Continuing with sync
* 16:28 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255779{{!}}Revert "[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:26 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255779{{!}}Revert "[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki"]]
* 16:25 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:24 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255773{{!}}[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki (T420531)]] (duration: 06m 06s)
* 16:23 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:aqs-codfw
* 16:21 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 16:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 16:20 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:20 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy4004.wikimedia.org
* 16:20 fabfur@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=1) rolling upgrade of HAProxy on P<nowiki>{</nowiki>cp2041*<nowiki>}</nowiki> and A:cp - 3.2 test upgrade ()
* 16:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P<nowiki>{</nowiki>cp2041*<nowiki>}</nowiki> and A:cp - 3.2 test upgrade ()
* 16:20 jforrester@deploy2002: jforrester: Continuing with sync
* 16:19 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255773{{!}}[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki (T420531)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2003.codfw.wmnet
* 16:17 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255773{{!}}[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki (T420531)]]
* 16:17 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 16:17 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 16:17 jmm@cumin2002: START - Cookbook sre.netbox.restart-reboot rolling reboot on A:netbox
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host hcaptcha-proxy4003.wikimedia.org
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy4003.wikimedia.org with OS bookworm
* 16:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1142.eqiad.wmnet
* 16:14 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk2003.codfw.wmnet
* 16:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2002.codfw.wmnet
* 16:13 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
* 16:10 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk2002.codfw.wmnet
* 16:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5025.eqsin.wmnet with reason: host reimage
* 16:08 brouberol@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 16:07 brouberol@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 16:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1142.eqiad.wmnet
* 16:06 brouberol@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 16:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2001.codfw.wmnet
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5025.eqsin.wmnet with reason: host reimage
* 16:05 brouberol@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 16:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk2001.codfw.wmnet
* 15:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy4003.wikimedia.org with reason: host reimage
* 15:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy4003.wikimedia.org with reason: host reimage
* 15:36 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief2002.codfw.wmnet
* 15:35 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5025.eqsin.wmnet with OS trixie
* 15:34 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5026.eqsin.wmnet with OS trixie
* 15:34 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5025.eqsin.wmnet with OS trixie
* 15:34 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5026.eqsin.wmnet with OS trixie
* 15:32 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief2002.codfw.wmnet
* 15:32 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief1002.eqiad.wmnet
* 15:32 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy4003.wikimedia.org with OS bookworm
* 15:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4003.wikimedia.org - jmm@cumin2002"
* 15:31 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4003.wikimedia.org - jmm@cumin2002"
* 15:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy4003.wikimedia.org on all recursors
* 15:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy4003.wikimedia.org on all recursors
* 15:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4003.wikimedia.org - jmm@cumin2002"
* 15:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test2001.codfw.wmnet
* 15:29 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test1001.eqiad.wmnet
* 15:28 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief1002.eqiad.wmnet
* 15:26 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test2001.codfw.wmnet
* 15:25 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test1001.eqiad.wmnet
* 15:22 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255747{{!}}Hooks: Re-apply I52fc151ab88d79754baeff35d2c0f200ebe9fc9a]] (duration: 09m 55s)
* 15:22 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4003.wikimedia.org - jmm@cumin2002"
* 15:22 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:22 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:21 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:21 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:21 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:18 phuedx@deploy2002: phuedx: Continuing with sync
* 15:18 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:17 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:17 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 15:16 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 15:16 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 15:15 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 15:15 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 15:15 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 15:15 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 15:14 phuedx@deploy2002: phuedx: Backport for [[gerrit:1255747{{!}}Hooks: Re-apply I52fc151ab88d79754baeff35d2c0f200ebe9fc9a]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 15:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy4003.wikimedia.org
* 15:12 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1255747{{!}}Hooks: Re-apply I52fc151ab88d79754baeff35d2c0f200ebe9fc9a]]
* 15:11 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 15:10 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 15:10 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 15:09 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 15:09 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 15:09 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 15:07 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5026.eqsin.wmnet with OS trixie
* 15:06 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5025.eqsin.wmnet with OS trixie
* 15:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh4004.wikimedia.org
* 15:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host doh4004.wikimedia.org with OS bookworm
* 15:05 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk1003.eqiad.wmnet
* 15:01 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk1003.eqiad.wmnet
* 14:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk1002.eqiad.wmnet
* 14:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host matomo1003.eqiad.wmnet
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zookeeper-test1002.eqiad.wmnet
* 14:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk1002.eqiad.wmnet
* 14:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk1001.eqiad.wmnet
* 14:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host matomo1003.eqiad.wmnet
* 14:52 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host zookeeper-test1002.eqiad.wmnet
* 14:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk1001.eqiad.wmnet
* 14:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-conf1006.eqiad.wmnet
* 14:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh4004.wikimedia.org with reason: host reimage
* 14:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-conf1006.eqiad.wmnet
* 14:43 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh4004.wikimedia.org with reason: host reimage
* 14:40 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 14:38 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 14:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-conf1005.eqiad.wmnet
* 14:32 bking@cumin2002: conftool action : set/pooled=yes:weight=1; selector: name=dse-k8s-worker1010.eqiad.wmnet{{!}}dse-k8s-worker1011.eqiad.wmnet{{!}}dse-k8s-worker1012.eqiad.wmnet{{!}}dse-k8s-worker1013.eqiad.wmnet{{!}}dse-k8s-worker1015.eqiad.wmnet{{!}}dse-k8s-worker1016.eqiad.wmnet{{!}}dse-k8s-worker1017.eqiad.wmnet{{!}}dse-k8s-worker1018.eqiad.wmnet{{!}}dse-k8s-worker1019.eqiad.wmnet
* 14:29 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-conf1005.eqiad.wmnet
* 14:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-conf1004.eqiad.wmnet
* 14:25 bking@cumin2002: conftool action : set/pooled=yes:weight=10; selector: name=dse-k8s-worker1012.eqiad.wmnet{{!}}dse-k8s-worker1015.eqiad.wmnet{{!}}dse-k8s-worker1016.eqiad.wmnet{{!}}dse-k8s-worker1017.eqiad.wmnet
* 14:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-conf1004.eqiad.wmnet
* 14:21 jmm@deploy2002: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 14:20 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host doh4004.wikimedia.org with OS bookworm
* 14:20 jmm@deploy2002: helmfile [eqiad] START helmfile.d/services/proton: apply
* 14:19 jmm@deploy2002: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 14:18 jmm@deploy2002: helmfile [codfw] START helmfile.d/services/proton: apply
* 14:18 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh4004.wikimedia.org - jmm@cumin2002"
* 14:17 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh4004.wikimedia.org - jmm@cumin2002"
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4004.wikimedia.org on all recursors
* 14:17 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4004.wikimedia.org on all recursors
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4004.wikimedia.org - jmm@cumin2002"
* 14:13 jmm@deploy2002: helmfile [staging] DONE helmfile.d/services/proton: apply
* 14:12 jmm@deploy2002: helmfile [staging] START helmfile.d/services/proton: apply
* 14:11 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4004.wikimedia.org - jmm@cumin2002"
* 14:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:04 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh4004.wikimedia.org
* 14:03 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh4003.wikimedia.org
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host doh4003.wikimedia.org with OS bookworm
* 13:53 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:46 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250107{{!}}Expose new wikifunctions.v0 REST API module on Wikifunctions.org only (T419053)]] (duration: 06m 03s)
* 13:42 jforrester@deploy2002: jforrester: Continuing with sync
* 13:42 jforrester@deploy2002: jforrester: Backport for [[gerrit:1250107{{!}}Expose new wikifunctions.v0 REST API module on Wikifunctions.org only (T419053)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:40 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1250107{{!}}Expose new wikifunctions.v0 REST API module on Wikifunctions.org only (T419053)]]
* 13:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh4003.wikimedia.org with reason: host reimage
* 13:33 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh4003.wikimedia.org with reason: host reimage
* 13:22 urbanecm@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255686{{!}}CreateAccount: Add class to aide in instrumentation]], [[gerrit:1255685{{!}}createAccount: Log exposure and CTRs for account creation experiment (T419916)]] (duration: 12m 58s)
* 13:22 moritzm: upgrade rpki1001 to Routinator 0.15.1 [[phab:T420572|T420572]]
* 13:15 urbanecm@deploy2002: migr, urbanecm: Continuing with sync
* 13:13 urbanecm@deploy2002: migr, urbanecm: Backport for [[gerrit:1255686{{!}}CreateAccount: Add class to aide in instrumentation]], [[gerrit:1255685{{!}}createAccount: Log exposure and CTRs for account creation experiment (T419916)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:12 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host doh4003.wikimedia.org with OS bookworm
* 13:09 urbanecm@deploy2002: Started scap sync-world: Backport for [[gerrit:1255686{{!}}CreateAccount: Add class to aide in instrumentation]], [[gerrit:1255685{{!}}createAccount: Log exposure and CTRs for account creation experiment (T419916)]]
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh4003.wikimedia.org - jmm@cumin2002"
* 13:09 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh4003.wikimedia.org - jmm@cumin2002"
* 13:07 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 17 hosts with reason: upgrade
* 13:01 moritzm: installing rsync security updates
* 12:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1017.eqiad.wmnet
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm7001.magru.wmnet
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm7001.magru.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 12:59 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm7001.magru.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 12:57 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 12:54 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 12:53 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1017.eqiad.wmnet
* 12:53 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1016.eqiad.wmnet
* 12:52 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 12:52 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 12:51 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 12:51 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 12:50 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 12:50 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 12:50 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 12:50 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 12:49 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 12:48 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 12:48 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 12:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1016.eqiad.wmnet
* 12:47 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:46 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:46 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 12:46 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:46 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:45 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:45 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:45 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:45 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 12:43 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 12:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 12:43 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb2002.codfw.wmnet
* 12:41 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 12:41 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm7001.magru.wmnet
* 12:41 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 12:41 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 12:40 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 12:40 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 12:39 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 12:39 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 12:38 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 12:37 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 12:37 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:37 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
* 12:37 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh4003.wikimedia.org
* 12:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb2002.codfw.wmnet
* 12:29 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:27 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:25 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:25 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:24 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:23 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:22 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:22 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:10 urbanecm@deploy2002: mwscript-k8s job started: GrowthExperiments:reassignMentees --wiki=enwiki --mentor=Bilorv --performer=Bilorv --as-job # [[phab:T418194|T418194]]
* 11:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 11:59 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:58 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-codfw
* 11:57 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 11:53 moritzm: upgrade rpki2003 to Routinator 0.15.1 [[phab:T420572|T420572]]
* 11:46 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:40 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:40 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1017.eqiad.wmnet with reason: host reimage
* 11:26 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1017.eqiad.wmnet with reason: host reimage
* 11:18 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 11:18 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-codfw
* 11:11 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 11:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
* 11:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
* 11:05 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-codfw
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 10:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 10:55 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5]*<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 10:54 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host doh4003.wikimedia.org
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 10:54 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 10:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2004.codfw.wmnet
* 10:51 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 10:50 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:50 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 10:50 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 10:50 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:50 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 10:50 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 10:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2004.codfw.wmnet
* 10:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2005.codfw.wmnet
* 10:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema2004.codfw.wmnet
* 10:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2005.codfw.wmnet
* 10:43 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:43 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh4003.wikimedia.org
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4007.ulsfo.wmnet to cluster ulsfo02 and group 01
* 10:42 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4007.ulsfo.wmnet to cluster ulsfo02 and group 01
* 10:41 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host schema2004.codfw.wmnet
* 10:39 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema2003.codfw.wmnet
* 10:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host schema2003.codfw.wmnet
* 10:37 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1002.wikimedia.org
* 10:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema1004.eqiad.wmnet
* 10:36 Raine: created temporary categorylinks_icu72 tables -- [[phab:T419980|T419980]], [[phab:T419049|T419049]]
* 10:36 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 10:34 aokoth@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 10:33 aokoth@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 10:32 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host schema1004.eqiad.wmnet
* 10:32 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:31 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 10:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4007.ulsfo.wmnet
* 10:29 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5]*<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 10:28 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:28 fnegri@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1002.wikimedia.org
* 10:26 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2006.codfw.wmnet
* 10:25 btullis@cumin1003: END (PASS) - Cookbook sre.opensearch.roll-restart-reboot (exit_code=0) rolling reboot on A:datahubsearch
* 10:24 btullis@cumin1003: END (FAIL) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=99) rolling reboot on A:cephosd-eqiad
* 10:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4007.ulsfo.wmnet
* 10:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2006.codfw.wmnet
* 10:21 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1001.wikimedia.org
* 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2008.wikimedia.org
* 10:19 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:18 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2008.wikimedia.org
* 10:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2007.codfw.wmnet
* 10:13 fnegri@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1001.wikimedia.org
* 10:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2007.codfw.wmnet
* 10:09 btullis@cumin1003: START - Cookbook sre.opensearch.roll-restart-reboot rolling reboot on A:datahubsearch
* 10:04 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:03 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4007.ulsfo.wmnet with OS bookworm
* 09:58 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 17 hosts with reason: upgrade
* 09:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema1003.eqiad.wmnet
* 09:53 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host schema1003.eqiad.wmnet
* 09:46 jnuche@deploy2002: Finished deploy [releng/jenkins-deploy@863e5c2] (releasing): [[phab:T420477|T420477]] (duration: 01m 07s)
* 09:45 jnuche@deploy2002: Started deploy [releng/jenkins-deploy@863e5c2] (releasing): [[phab:T420477|T420477]]
* 09:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 09:43 jnuche@deploy2002: Finished deploy [releng/jenkins-deploy@863e5c2] (releasing): [[phab:T420477|T420477]] (duration: 00m 59s)
* 09:42 jnuche@deploy2002: Started deploy [releng/jenkins-deploy@863e5c2] (releasing): [[phab:T420477|T420477]]
* 09:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4007.ulsfo.wmnet with reason: host reimage
* 09:35 moritzm: installing libnginx-mod-http-lua security updates
* 09:35 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4007.ulsfo.wmnet with reason: host reimage
* 09:29 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
* 09:26 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
* 09:26 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
* 09:24 klausman@cumin2002: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:ml-serve-worker-codfw
* 09:21 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:21 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:19 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:19 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:13 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4007.ulsfo.wmnet with OS bookworm
* 09:11 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
* 09:01 moritzm: remove ganeti4007 from classic Ganeti cluster in ulsfo [[phab:T418993|T418993]]
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of doh4001.wikimedia.org to plain
* 08:54 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of doh4001.wikimedia.org to plain
* 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of doh4002.wikimedia.org to plain
* 08:46 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of doh4002.wikimedia.org to plain
* 08:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of hcaptcha-proxy4001.wikimedia.org to plain
* 08:45 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of hcaptcha-proxy4001.wikimedia.org to plain
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of hcaptcha-proxy4002.wikimedia.org to plain
* 08:44 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of hcaptcha-proxy4002.wikimedia.org to plain
* 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of install4003.wikimedia.org to plain
* 08:42 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of install4003.wikimedia.org to plain
* 08:40 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:38 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host doh4003.wikimedia.org
* 08:38 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 08:38 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 08:38 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:38 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 08:38 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 08:35 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 08:34 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 08:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 08:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 08:31 moritzm: installing python-apt security updates
* 08:29 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:29 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh4003.wikimedia.org
* 08:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 08:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 08:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 08:14 moritzm: installing imagemagick security updates on Bullseye
* 08:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1048.eqiad.wmnet
* 08:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1048.eqiad.wmnet
* 08:12 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.20 refs [[phab:T413811|T413811]]
* 08:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1048.eqiad.wmnet
* 07:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1048.eqiad.wmnet
* 07:17 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 07:17 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 07:16 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 07:16 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 07:14 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 07:14 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 04:53 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 00:06 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon2003.codfw.wmnet
* 00:02 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon2003.codfw.wmnet
* 00:01 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon1003.eqiad.wmnet
== 2026-03-18 ==
* 23:58 mutante: releases2003 - kill 782 (stunnel4) - systemctl start stunnel4 - fix [[phab:T420246|T420246]] [[phab:T420388|T420388]] [[phab:T420411|T420411]]
* 23:57 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon1003.eqiad.wmnet
* 23:49 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:cassandra-dev
* 23:23 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:cassandra-dev
* 23:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5017.*
* 23:02 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5020.*
* 23:01 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5028.*
* 22:40 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5028.eqsin.wmnet with OS trixie
* 22:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5020.eqsin.wmnet with OS trixie
* 22:08 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 22:04 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 21:51 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-eqiad
* 21:49 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on A:dnsbox
* 21:49 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7002.wikimedia.org
* 21:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 21:41 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5027.*
* 21:40 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 21:31 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 21:30 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5028.eqsin.wmnet with OS trixie
* 21:30 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7002.wikimedia.org
* 21:29 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5027.eqsin.wmnet with OS trixie
* 21:27 jforrester@deploy2002: mwscript-k8s job started: extensions/WikimediaMaintenance/maintenance/addWiki.php --wiki=abstractwiki # [[phab:T411723|T411723]] addWiki.php run
* 21:26 jforrester@deploy2002: mwscript-k8s job started: extensions/WikimediaMaintenance/maintenance/addWiki.php --wiki=abstractwiki # [[phab:T411723|T411723]] addWiki.php run
* 21:24 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255034{{!}}Revert "OrchestratorRequest: Switch evaluations to v2 endpoint" (T418887)]], [[gerrit:1247650{{!}}Create Abstract Wikipedia (T411725 T411726)]] (duration: 06m 44s)
* 21:20 jforrester@deploy2002: jforrester: Continuing with sync
* 21:20 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255034{{!}}Revert "OrchestratorRequest: Switch evaluations to v2 endpoint" (T418887)]], [[gerrit:1247650{{!}}Create Abstract Wikipedia (T411725 T411726)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:17 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255034{{!}}Revert "OrchestratorRequest: Switch evaluations to v2 endpoint" (T418887)]], [[gerrit:1247650{{!}}Create Abstract Wikipedia (T411725 T411726)]]
* 21:16 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5017.eqsin.wmnet with OS trixie
* 21:15 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7001.wikimedia.org
* 21:12 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5020.eqsin.wmnet with OS trixie
* 21:08 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255013{{!}}Guard for JS null deref on empty Parsoid sections (T419721)]], [[gerrit:1254889{{!}}Reapply "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]] (duration: 11m 20s)
* 21:07 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 21:07 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5028.eqsin.wmnet with OS trixie
* 21:04 jdlrobson@deploy2002: jdlrobson, harroyo-wmf: Continuing with sync
* 20:59 jdlrobson@deploy2002: jdlrobson, harroyo-wmf: Backport for [[gerrit:1255013{{!}}Guard for JS null deref on empty Parsoid sections (T419721)]], [[gerrit:1254889{{!}}Reapply "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:59 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-eqiad
* 20:58 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7001.wikimedia.org
* 20:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5027.eqsin.wmnet with reason: host reimage
* 20:57 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1255013{{!}}Guard for JS null deref on empty Parsoid sections (T419721)]], [[gerrit:1254889{{!}}Reapply "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]]
* 20:52 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-codfw
* 20:51 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5027.eqsin.wmnet with reason: host reimage
* 20:51 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in1001.wikimedia.org with reason: [[phab:T419960|T419960]]
* 20:50 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5020.eqsin.wmnet with OS trixie
* 20:50 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in2001.wikimedia.org with reason: [[phab:T419960|T419960]]
* 20:49 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 20:48 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5028.eqsin.wmnet with OS trixie
* 20:44 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5017.eqsin.wmnet with reason: host reimage
* 20:43 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns6002.wikimedia.org
* 20:42 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out1001.wikimedia.org with reason: [[phab:T419960|T419960]]
* 20:42 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1033.eqiad.wmnet with OS trixie
* 20:42 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 20:42 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 20:38 cscott@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254956{{!}}Limit legacy postprocessing cache to pages where DT does apply (T376183)]] (duration: 13m 54s)
* 20:37 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5017.eqsin.wmnet with reason: host reimage
* 20:35 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out2001.wikimedia.org with reason: [[phab:T419960|T419960]]
* 20:34 cscott@deploy2002: cscott: Continuing with sync
* 20:33 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns6002.wikimedia.org
* 20:28 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_codfw and not P<nowiki>{</nowiki>cp2042.codfw.wmnet<nowiki>}</nowiki> and A:cp
* 20:28 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2058.codfw.wmnet
* 20:28 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_codfw and not P<nowiki>{</nowiki>cp2041.codfw.wmnet<nowiki>}</nowiki> and A:cp
* 20:28 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2057.codfw.wmnet
* 20:26 cscott@deploy2002: cscott: Backport for [[gerrit:1254956{{!}}Limit legacy postprocessing cache to pages where DT does apply (T376183)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1033.eqiad.wmnet with reason: host reimage
* 20:24 cscott@deploy2002: Started scap sync-world: Backport for [[gerrit:1254956{{!}}Limit legacy postprocessing cache to pages where DT does apply (T376183)]]
* 20:24 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5027.eqsin.wmnet with OS trixie
* 20:22 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5029.*
* 20:21 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5029.eqsin.wmnet with OS trixie
* 20:20 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1033.eqiad.wmnet with reason: host reimage
* 20:18 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns6001.wikimedia.org
* 20:14 kemayo@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254965{{!}}Editcheck: fix tagging not happening for non-default checks]] (duration: 06m 28s)
* 20:10 kemayo@deploy2002: kemayo: Continuing with sync
* 20:10 kemayo@deploy2002: kemayo: Backport for [[gerrit:1254965{{!}}Editcheck: fix tagging not happening for non-default checks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1033.eqiad.wmnet with OS trixie
* 20:08 kemayo@deploy2002: Started scap sync-world: Backport for [[gerrit:1254965{{!}}Editcheck: fix tagging not happening for non-default checks]]
* 20:06 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 20:05 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns6001.wikimedia.org
* 20:05 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5017.eqsin.wmnet with OS trixie
* 20:05 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.*
* 20:05 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-codfw
* 19:51 Reedy: running `foreachwikiindblist fishbowl.dblist extensions/OATHAuth/maintenance/UpdateSecretsToEncryptedFormat.php` [[phab:T404363|T404363]]
* 19:51 Reedy: running `foreachwikiindblist private.dblist extensions/OATHAuth/maintenance/UpdateSecretsToEncryptedFormat.php` [[phab:T404363|T404363]]
* 19:50 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns5004.wikimedia.org
* 19:50 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2056.codfw.wmnet
* 19:50 Reedy: running `mwscript extensions/OATHAuth/maintenance/UpdateSecretsToEncryptedFormat.php --wiki=metawiki` [[phab:T404363|T404363]]
* 19:49 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5029.eqsin.wmnet with reason: host reimage
* 19:49 reedy@deploy2002: Synchronized private/PrivateSettings.php: Set $wgOATHSecretKey [[phab:T404363|T404363]] (duration: 05m 51s)
* 19:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2055.codfw.wmnet
* 19:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5030.eqsin.wmnet with OS trixie
* 19:42 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5029.eqsin.wmnet with reason: host reimage
* 19:39 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5017.eqsin.wmnet with OS trixie
* 19:35 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 19:35 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 19:33 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns5004.wikimedia.org
* 19:30 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host install4004.wikimedia.org with OS bookworm
* 19:29 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5020.eqsin.wmnet with OS trixie
* 19:28 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5020.eqsin.wmnet [reason: trixie reimaging]
* 19:28 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5018.eqsin.wmnet [reason: trixie reimaging]
* 19:27 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 19:27 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 19:26 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5018.eqsin.wmnet with OS trixie
* 19:23 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:23 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:18 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns5003.wikimedia.org
* 19:17 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 19:13 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 19:13 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on install4004.wikimedia.org with reason: host reimage
* 19:11 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2054.codfw.wmnet
* 19:08 brett@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts cp5029.eqsin.wmnet
* 19:08 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2053.codfw.wmnet
* 19:08 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5029.eqsin.wmnet with OS trixie
* 19:08 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on install4004.wikimedia.org with reason: host reimage
* 19:02 brett@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp5029.eqsin.wmnet
* 19:01 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns5003.wikimedia.org
* 18:56 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5031.*
* 18:56 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5031.eqsin.wmnet with OS trixie
* 18:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 18:47 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 18:46 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5030.eqsin.wmnet with OS trixie
* 18:46 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns4004.wikimedia.org
* 18:45 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host install4004.wikimedia.org with OS bookworm
* 18:45 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
* 18:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5032.eqsin.wmnet with OS trixie
* 18:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2052.codfw.wmnet
* 18:29 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2051.codfw.wmnet
* 18:27 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns4004.wikimedia.org
* 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5031.eqsin.wmnet with reason: host reimage
* 18:18 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5017.eqsin.wmnet with OS trixie
* 18:17 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5031.eqsin.wmnet with reason: host reimage
* 18:17 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5017.eqsin.wmnet [reason: trixie reimaging]
* 18:17 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5018.eqsin.wmnet with OS trixie
* 18:16 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5018.eqsin.wmnet [reason: trixie reimaging]
* 18:16 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5017.eqsin.wmnet [reason: trixie reimaging]
* 18:16 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5017.eqsin.wmnet [reason: trixie reimaging]
* 18:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3077.esams.wmnet [reason: trixie reimaging]
* 18:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3078.esams.wmnet [reason: trixie reimaging]
* 18:12 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 18:12 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns4003.wikimedia.org
* 18:09 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1258: Ready
* 18:07 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 18:01 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 17:59 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3078.esams.wmnet with OS trixie
* 17:56 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3077.esams.wmnet with OS trixie
* 17:55 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns4003.wikimedia.org
* 17:54 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2050.codfw.wmnet
* 17:51 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2049.codfw.wmnet
* 17:43 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:42 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:40 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns3004.wikimedia.org
* 17:39 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:38 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backupmon1001.eqiad.wmnet with reason: upgrade
* 17:35 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1347.eqiad.wmnet with OS trixie
* 17:33 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3078.esams.wmnet with reason: host reimage
* 17:32 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 17:32 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5031.eqsin.wmnet with OS trixie
* 17:32 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5032.eqsin.wmnet with OS trixie
* 17:32 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5031.eqsin.wmnet with OS trixie
* 17:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 17:30 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 17:30 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3077.esams.wmnet with reason: host reimage
* 17:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:29 claime: rearmed keyholder on deploy1003
* 17:28 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:27 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3078.esams.wmnet with reason: host reimage
* 17:26 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3077.esams.wmnet with reason: host reimage
* 17:25 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:25 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1258: Ready
* 17:23 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns3004.wikimedia.org
* 17:21 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 17:21 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 17:21 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:20 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:20 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-esams and A:ncredir
* 17:19 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1347.eqiad.wmnet with reason: host reimage
* 17:18 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-drmrs and A:ncredir
* 17:16 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-eqiad and A:ncredir
* 17:15 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-ulsfo and A:ncredir
* 17:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2048.codfw.wmnet
* 17:15 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 17:14 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1347.eqiad.wmnet with reason: host reimage
* 17:14 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5031.eqsin.wmnet with OS trixie
* 17:12 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 17:12 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2047.codfw.wmnet
* 17:11 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 17:09 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 17:09 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp3078.*
* 17:08 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 17:08 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 17:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3079.*
* 17:08 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns3003.wikimedia.org
* 17:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3078.*
* 17:07 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-eqiad and A:ncredir
* 17:07 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
* 17:07 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-esams and A:ncredir
* 17:06 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
* 17:06 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir2002.*
* 17:05 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-drmrs and A:ncredir
* 17:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir2002.codfw.wmnet
* 17:05 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-eqsin and A:ncredir
* 17:05 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-ulsfo and A:ncredir
* 17:04 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-magru and A:ncredir
* 17:03 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3078.esams.wmnet with OS trixie
* 17:02 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1347
* 17:02 jayme@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1347
* 17:02 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3078.esams.wmnet [reason: trixie reimaging]
* 17:01 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3076.esams.wmnet [reason: trixie reimaging]
* 17:01 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3077.esams.wmnet with OS trixie
* 17:01 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3077.esams.wmnet [reason: trixie reimaging]
* 16:59 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncredir2002.codfw.wmnet
* 16:58 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir2002.*
* 16:56 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 8 hosts with reason: upgrade
* 16:55 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir2001.*
* 16:55 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ncredir2001.codfw.wmnet
* 16:55 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for ncredir2001.codfw.wmnet
* 16:55 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3076.esams.wmnet with OS trixie
* 16:53 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2014.codfw.wmnet
* 16:52 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-eqsin and A:ncredir
* 16:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2008.codfw.wmnet with reason: kernel update
* 16:51 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-magru and A:ncredir
* 16:51 klausman@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on ml-serve1013.eqiad.wmnet with reason: Reboot for security update
* 16:50 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2013.codfw.wmnet
* 16:49 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir2001.*
* 16:49 brett@cumin2002: END (ERROR) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=97) rolling reboot on A:ncredir and A:ncredir
* 16:48 jayme@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1347
* 16:48 jayme@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1347.eqiad.wmnet 199.48.64.10.in-addr.arpa 9.9.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 16:48 jayme@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1347.eqiad.wmnet 199.48.64.10.in-addr.arpa 9.9.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 16:48 jayme@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:47 jayme@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1347 - jayme@cumin1003"
* 16:47 jayme@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1347 - jayme@cumin1003"
* 16:47 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns3003.wikimedia.org
* 16:47 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1012.eqiad.wmnet
* 16:47 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir and A:ncredir
* 16:47 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2012.codfw.wmnet
* 16:47 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2014.codfw.wmnet
* 16:46 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3075.esams.wmnet [reason: trixie reimaging]
* 16:46 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2003.codfw.wmnet
* 16:45 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet
* 16:44 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2013.codfw.wmnet
* 16:44 jayme@cumin1003: START - Cookbook sre.dns.netbox
* 16:43 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2009.codfw.wmnet
* 16:43 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1347
* 16:43 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99)
* 16:43 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1347.eqiad.wmnet with OS trixie
* 16:43 brett@cumin2002: START - Cookbook sre.hosts.reboot-cluster
* 16:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2007.codfw.wmnet with reason: kernel update
* 16:40 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2012.codfw.wmnet
* 16:39 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3079.esams.wmnet with OS trixie
* 16:39 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2008.codfw.wmnet
* 16:38 moritzm: installing PHP 8.2 security updates
* 16:37 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2009.codfw.wmnet
* 16:36 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2046.codfw.wmnet
* 16:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3078.esams.wmnet with OS trixie
* 16:34 moritzm: installing alsa-lib security updates
* 16:33 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2045.codfw.wmnet
* 16:32 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2008.codfw.wmnet
* 16:32 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns2006.wikimedia.org
* 16:29 moritzm: failover Ganeti master in eqiad to ganeti1046
* 16:29 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3076.esams.wmnet with reason: host reimage
* 16:29 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2003.codfw.wmnet
* 16:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1054.eqiad.wmnet
* 16:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1054.eqiad.wmnet
* 16:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2005.codfw.wmnet with reason: kernel update
* 16:24 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3076.esams.wmnet with reason: host reimage
* 16:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1054.eqiad.wmnet
* 16:22 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1012.eqiad.wmnet
* 16:20 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1013.eqiad.wmnet
* 16:19 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1011.eqiad.wmnet
* 16:18 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1054.eqiad.wmnet
* 16:18 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 16:16 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host install4004.wikimedia.org with OS bookworm
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1053.eqiad.wmnet
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1053.eqiad.wmnet
* 16:14 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns2006.wikimedia.org
* 16:14 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1013.eqiad.wmnet
* 16:14 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1009.eqiad.wmnet
* 16:13 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3079.esams.wmnet with reason: host reimage
* 16:13 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1011.eqiad.wmnet
* 16:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1029.eqiad.wmnet with reason: kernel update
* 16:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1016.eqiad.wmnet with OS bookworm
* 16:12 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 16:11 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 16:11 moritzm: powercycling ganeti1053 (stuck on reboot)
* 16:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3078.esams.wmnet with reason: host reimage
* 16:09 klausman@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:ml-serve-worker-eqiad
* 16:09 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 16:08 klausman@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:ml-serve-worker-eqiad
* 16:07 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1009.eqiad.wmnet
* 16:07 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1003.eqiad.wmnet
* 16:06 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3079.esams.wmnet with reason: host reimage
* 16:06 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 16:06 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3078.esams.wmnet with reason: host reimage
* 16:04 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 16:04 klausman@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:ml-serve-worker-eqiad
* 16:04 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 16:02 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 16:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1028.eqiad.wmnet with reason: kernel update
* 16:00 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1003.eqiad.wmnet
* 16:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1015.eqiad.wmnet
* 16:00 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3075.esams.wmnet with OS trixie
* 16:00 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3076.esams.wmnet with OS trixie
* 15:59 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns2005.wikimedia.org
* 15:58 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3076.esams.wmnet [reason: trixie reimaging]
* 15:58 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1012.eqiad.wmnet
* 15:58 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2044.codfw.wmnet
* 15:58 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3078.esams.wmnet [reason: trixie reimaging]
* 15:57 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3078.esams.wmnet [reason: trixie reimaging]
* 15:57 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1010.eqiad.wmnet
* 15:57 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1008.eqiad.wmnet
* 15:57 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3074.esams.wmnet [reason: trixie reimaging]
* 15:56 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 15:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1053.eqiad.wmnet
* 15:55 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2043.codfw.wmnet
* 15:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1017.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1023.eqiad.wmnet with reason: kernel update
* 15:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1015.eqiad.wmnet
* 15:54 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy1022.eqiad.wmnet
* 15:54 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1008.eqiad.wmnet
* 15:54 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy1022.eqiad.wmnet
* 15:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1053.eqiad.wmnet
* 15:52 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1052.eqiad.wmnet
* 15:52 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1010.eqiad.wmnet
* 15:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1052.eqiad.wmnet
* 15:51 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1012.eqiad.wmnet
* 15:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3074.esams.wmnet with OS trixie
* 15:49 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_codfw and not P<nowiki>{</nowiki>cp2042.codfw.wmnet<nowiki>}</nowiki> and A:cp
* 15:48 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1014.eqiad.wmnet
* 15:48 klausman@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on A:ml-serve-worker-eqiad
* 15:47 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy1003.eqiad.wmnet
* 15:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1052.eqiad.wmnet
* 15:46 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_codfw and not P<nowiki>{</nowiki>cp2041.codfw.wmnet<nowiki>}</nowiki> and A:cp
* 15:45 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1017.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:45 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns2005.wikimedia.org
* 15:42 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1014.eqiad.wmnet
* 15:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1052.eqiad.wmnet
* 15:41 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3079.esams.wmnet with OS trixie
* 15:41 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3078.esams.wmnet with OS trixie
* 15:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
* 15:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1051.eqiad.wmnet
* 15:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1022.eqiad.wmnet with reason: kernel update
* 15:38 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Update for dse-k8s-worker1016 - btullis@cumin1003"
* 15:37 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Update for dse-k8s-worker1016 - btullis@cumin1003"
* 15:37 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy1003.eqiad.wmnet
* 15:37 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1027.eqiad.wmnet
* 15:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1016.eqiad.wmnet with reason: host reimage
* 15:35 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-eqiad
* 15:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1051.eqiad.wmnet
* 15:34 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3075.esams.wmnet with reason: host reimage
* 15:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1372.eqiad.wmnet
* 15:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1371.eqiad.wmnet
* 15:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1370.eqiad.wmnet
* 15:31 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1027.eqiad.wmnet
* 15:30 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns2004.wikimedia.org
* 15:29 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1016.eqiad.wmnet with reason: host reimage
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1369.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1368.eqiad.wmnet
* 15:28 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1372.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1367.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1366.eqiad.wmnet
* 15:28 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1371.eqiad.wmnet
* 15:28 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1370.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1365.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1364.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1363.eqiad.wmnet
* 15:27 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1362.eqiad.wmnet
* 15:27 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1361.eqiad.wmnet
* 15:27 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1017
* 15:27 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1360.eqiad.wmnet
* 15:26 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1017
* 15:25 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3074.esams.wmnet with reason: host reimage
* 15:25 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install4004.wikimedia.org with OS bookworm
* 15:25 sukhe@dns1004: END - running authdns-update
* 15:24 sukhe@dns1004: START - running authdns-update
* 15:24 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install4004.wikimedia.org
* 15:24 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host install4004.wikimedia.org with OS bookworm
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1369.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1368.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1367.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1366.eqiad.wmnet
* 15:23 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1365.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1364.eqiad.wmnet
* 15:22 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1363.eqiad.wmnet
* 15:22 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1362.eqiad.wmnet
* 15:22 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1361.eqiad.wmnet
* 15:22 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1360.eqiad.wmnet
* 15:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1349.eqiad.wmnet
* 15:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1050.eqiad.wmnet
* 15:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1050.eqiad.wmnet
* 15:18 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3075.esams.wmnet with reason: host reimage
* 15:18 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3074.esams.wmnet with reason: host reimage
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1348.eqiad.wmnet
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1346.eqiad.wmnet
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1344.eqiad.wmnet
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1345.eqiad.wmnet
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1343.eqiad.wmnet
* 15:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1342.eqiad.wmnet
* 15:16 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns2004.wikimedia.org
* 15:15 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1349.eqiad.wmnet
* 15:15 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1341.eqiad.wmnet
* 15:15 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1340.eqiad.wmnet
* 15:15 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1339.eqiad.wmnet
* 15:15 moritzm: imported jenkins 2.541.3 for bullseye/bookworm/trixie
* 15:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1338.eqiad.wmnet
* 15:14 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1016.eqiad.wmnet with OS bookworm
* 15:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1050.eqiad.wmnet
* 15:12 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1348.eqiad.wmnet
* 15:12 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1346.eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1336.eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1337.eqiad.wmnet
* 15:12 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1345.eqiad.wmnet
* 15:12 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1344.eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1334.eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1335.eqiad.wmnet
* 15:11 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1343.eqiad.wmnet
* 15:11 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1342.eqiad.wmnet
* 15:11 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1332.eqiad.wmnet
* 15:11 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1333.eqiad.wmnet
* 15:11 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Post reimage - btullis@cumin1003"
* 15:11 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Post reimage - btullis@cumin1003"
* 15:10 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1341.eqiad.wmnet
* 15:10 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1340.eqiad.wmnet
* 15:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1331.eqiad.wmnet
* 15:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1330.eqiad.wmnet
* 15:10 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1339.eqiad.wmnet
* 15:09 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1329.eqiad.wmnet
* 15:09 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1338.eqiad.wmnet
* 15:09 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1328.eqiad.wmnet
* 15:09 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1050.eqiad.wmnet
* 15:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1049.eqiad.wmnet
* 15:07 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1337.eqiad.wmnet
* 15:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1049.eqiad.wmnet
* 15:07 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1336.eqiad.wmnet
* 15:07 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1335.eqiad.wmnet
* 15:06 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1334.eqiad.wmnet
* 15:06 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1333.eqiad.wmnet
* 15:06 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1332.eqiad.wmnet
* 15:05 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1331.eqiad.wmnet
* 15:05 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1330.eqiad.wmnet
* 15:04 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1329.eqiad.wmnet
* 15:04 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1328.eqiad.wmnet
* 15:03 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-eqiad
* 15:02 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1033.eqiad.wmnet with OS trixie
* 15:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1049.eqiad.wmnet
* 15:01 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns1006.wikimedia.org
* 14:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum4002.ulsfo.wmnet
* 14:58 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:58 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:57 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
* 14:54 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3075.esams.wmnet with OS trixie
* 14:54 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3075.esams.wmnet [reason: trixie reimaging]
* 14:54 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3074.esams.wmnet with OS trixie
* 14:53 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3074.esams.wmnet [reason: trixie reimaging]
* 14:52 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1047.eqiad.wmnet
* 14:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1047.eqiad.wmnet
* 14:50 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:49 slyngshede@dns1004: END - running authdns-update
* 14:48 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:48 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:48 slyngshede@dns1004: START - running authdns-update
* 14:47 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns1006.wikimedia.org
* 14:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1047.eqiad.wmnet
* 14:46 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum4002.ulsfo.wmnet
* 14:45 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-eqiad
* 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum4001.ulsfo.wmnet
* 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum4001.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:44 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum4001.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1047.eqiad.wmnet
* 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1046.eqiad.wmnet
* 14:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1046.eqiad.wmnet
* 14:40 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:38 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1046.eqiad.wmnet
* 14:36 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum4001.ulsfo.wmnet
* 14:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install4004.wikimedia.org with OS bookworm
* 14:33 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1046.eqiad.wmnet
* 14:32 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns1005.wikimedia.org
* 14:32 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "inline pattern and pattern equivalence - oblivian@cumin1003"
* 14:32 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: inline pattern and pattern equivalence - oblivian@cumin1003
* 14:31 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: inline pattern and pattern equivalence - oblivian@cumin1003
* 14:31 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "inline pattern and pattern equivalence - oblivian@cumin1003"
* 14:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1045.eqiad.wmnet
* 14:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1045.eqiad.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install4004.wikimedia.org - jmm@cumin2002"
* 14:25 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install4004.wikimedia.org - jmm@cumin2002"
* 14:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1045.eqiad.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install4004.wikimedia.org on all recursors
* 14:24 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install4004.wikimedia.org on all recursors
* 14:21 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) install4004.wikimedia.org on all recursors
* 14:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install4004.wikimedia.org on all recursors
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install4004.wikimedia.org - jmm@cumin2002"
* 14:20 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install4004.wikimedia.org - jmm@cumin2002"
* 14:19 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns1005.wikimedia.org
* 14:17 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254911{{!}}Restore quotation-marks in ext.wikilambda.app messages (T420456)]] (duration: 06m 32s)
* 14:17 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:17 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS bookworm
* 14:17 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:16 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:16 apine@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install4004.wikimedia.org
* 14:15 apine@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:15 apine@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:14 apine@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:14 apine@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:14 jforrester@deploy2002: jforrester: Continuing with sync
* 14:13 apine@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:13 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:13 jforrester@deploy2002: jforrester: Backport for [[gerrit:1254911{{!}}Restore quotation-marks in ext.wikilambda.app messages (T420456)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:13 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:11 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1254911{{!}}Restore quotation-marks in ext.wikilambda.app messages (T420456)]]
* 14:08 apine@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 apine@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 apine@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:06 apine@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:05 XioNoX: set graceful-shutdown on EdgeUno transit sessions
* 14:05 apine@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:04 apine@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:04 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns1004.wikimedia.org
* 14:02 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1045.eqiad.wmnet
* 14:01 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 14:01 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:59 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 13:57 Msz2001: UTC afternoon backport+config window done
* 13:56 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254876{{!}}Tweak configuration of external link aggregate usage analysis (T419837)]] (duration: 06m 41s)
* 13:55 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1033.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:55 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1033.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:53 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 13:52 mszwarc@deploy2002: mszwarc: Continuing with sync
* 13:51 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1254876{{!}}Tweak configuration of external link aggregate usage analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:50 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns1004.wikimedia.org
* 13:50 sukhe@cumin1003: START - Cookbook sre.dns.roll-reboot rolling reboot on A:dnsbox
* 13:49 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1254876{{!}}Tweak configuration of external link aggregate usage analysis (T419837)]]
* 13:49 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254916{{!}}Normalize external domain names in click analysis (T419837)]], [[gerrit:1254917{{!}}Normalize external domain names in click analysis (T419837)]] (duration: 07m 23s)
* 13:46 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet
* 13:45 mszwarc@deploy2002: mszwarc: Continuing with sync
* 13:43 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1254916{{!}}Normalize external domain names in click analysis (T419837)]], [[gerrit:1254917{{!}}Normalize external domain names in click analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2096.codfw.wmnet
* 13:41 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1033.eqiad.wmnet with OS trixie
* 13:41 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS bookworm
* 13:41 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1254916{{!}}Normalize external domain names in click analysis (T419837)]], [[gerrit:1254917{{!}}Normalize external domain names in click analysis (T419837)]]
* 13:40 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248095{{!}}filebackend: Remove outdated comment]], [[gerrit:1254216{{!}}GrowthExperiments: increase edit and thanks query limit II (T341599)]] (duration: 08m 47s)
* 13:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1044.eqiad.wmnet
* 13:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1044.eqiad.wmnet
* 13:39 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet
* 13:39 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1096.eqiad.wmnet
* 13:37 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2096.codfw.wmnet
* 13:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:36 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2095.codfw.wmnet
* 13:36 sgimeno@deploy2002: matmarex, sgimeno: Continuing with sync
* 13:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1044.eqiad.wmnet
* 13:33 sgimeno@deploy2002: matmarex, sgimeno: Backport for [[gerrit:1248095{{!}}filebackend: Remove outdated comment]], [[gerrit:1254216{{!}}GrowthExperiments: increase edit and thanks query limit II (T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:31 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1096.eqiad.wmnet
* 13:31 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1095.eqiad.wmnet
* 13:31 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1248095{{!}}filebackend: Remove outdated comment]], [[gerrit:1254216{{!}}GrowthExperiments: increase edit and thanks query limit II (T341599)]]
* 13:30 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2095.codfw.wmnet
* 13:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2094.codfw.wmnet
* 13:28 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254894{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254895{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254891{{!}}Revert "SpecialPreferences: Use Language Select Widget in language field" (T419895)]], [[gerrit:1254890{{!}}Revert "SpecialPreferences: Use Language Select Widget in lan
* 13:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 13:28 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 13:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1026.eqiad.wmnet
* 13:27 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1044.eqiad.wmnet
* 13:26 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 13:25 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1095.eqiad.wmnet
* 13:25 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1094.eqiad.wmnet
* 13:24 sgimeno@deploy2002: somerandomdeveloper, sgimeno: Continuing with sync
* 13:24 sgimeno@deploy2002: somerandomdeveloper, sgimeno: Backport for [[gerrit:1254894{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254895{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254891{{!}}Revert "SpecialPreferences: Use Language Select Widget in language field" (T419895)]], [[gerrit:1254890{{!}}Revert "SpecialPreferences: Use Language Select Widget in
* 13:23 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:23 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2094.codfw.wmnet
* 13:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2093.codfw.wmnet
* 13:22 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1254894{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254895{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254891{{!}}Revert "SpecialPreferences: Use Language Select Widget in language field" (T419895)]], [[gerrit:1254890{{!}}Revert "SpecialPreferences: Use Language Select Widget in lang
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1043.eqiad.wmnet
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1043.eqiad.wmnet
* 13:21 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1026.eqiad.wmnet
* 13:20 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:16 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2093.codfw.wmnet
* 13:16 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2092.codfw.wmnet
* 13:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1043.eqiad.wmnet
* 13:16 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1094.eqiad.wmnet
* 13:16 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1093.eqiad.wmnet
* 13:15 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:15 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 13:15 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 13:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1043.eqiad.wmnet
* 13:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1042.eqiad.wmnet
* 13:10 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:10 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 13:09 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1093.eqiad.wmnet
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1042.eqiad.wmnet
* 13:09 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2092.codfw.wmnet
* 13:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1027.eqiad.wmnet with reason: host reimage
* 13:08 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1016
* 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 13:06 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1016
* 13:06 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 13:04 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1027.eqiad.wmnet with reason: host reimage
* 13:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1042.eqiad.wmnet
* 13:02 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 13:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1012.eqiad.wmnet
* 12:58 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1020.eqiad.wmnet with OS bookworm
* 12:58 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 12:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1042.eqiad.wmnet
* 12:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1015.eqiad.wmnet
* 12:55 ayounsi@dns1004: END - running authdns-update
* 12:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1012.eqiad.wmnet
* 12:54 ayounsi@dns1004: START - running authdns-update
* 12:53 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1012.eqiad.wmnet
* 12:53 ayounsi@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 12:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1015.eqiad.wmnet
* 12:50 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 12:50 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1041.eqiad.wmnet
* 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1041.eqiad.wmnet
* 12:47 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-codfw
* 12:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1041.eqiad.wmnet
* 12:42 btullis@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-jumbo-eqiad
* 12:38 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:37 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:37 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:36 ayounsi@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 12:35 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1372].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 12:33 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:32 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:31 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:30 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1020.eqiad.wmnet with reason: host reimage
* 12:27 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1041.eqiad.wmnet
* 12:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1040.eqiad.wmnet
* 12:25 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1020.eqiad.wmnet with reason: host reimage
* 12:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1040.eqiad.wmnet
* 12:25 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 12:25 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1026.eqiad.wmnet with OS bookworm
* 12:25 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:24 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Update for dse-k8s-worker1015 - btullis@cumin1003"
* 12:24 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Update for dse-k8s-worker1015 - btullis@cumin1003"
* 12:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 12:22 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:21 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1040.eqiad.wmnet
* 12:19 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:19 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host serpens.wikimedia.org
* 12:13 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254851{{!}}Enable autodemotion for 2FA-less CN admins and WMF T&S (T418580)]] (duration: 06m 21s)
* 12:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host serpens.wikimedia.org
* 12:10 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1040.eqiad.wmnet
* 12:10 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-eqiad
* 12:09 mszwarc@deploy2002: mszwarc: Continuing with sync
* 12:09 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1254851{{!}}Enable autodemotion for 2FA-less CN admins and WMF T&S (T418580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:07 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1254851{{!}}Enable autodemotion for 2FA-less CN admins and WMF T&S (T418580)]]
* 12:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1039.eqiad.wmnet
* 12:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1039.eqiad.wmnet
* 12:05 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:05 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-eqiad
* 12:04 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:03 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:02 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254883{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254884{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]], [[gerrit:1254882{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254881{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]] (duration: 06m 48s)
* 12:02 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1026.eqiad.wmnet with reason: host reimage
* 12:02 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-eqiad
* 12:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1039.eqiad.wmnet
* 12:01 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1039.eqiad.wmnet
* 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet
* 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
* 11:59 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1026.eqiad.wmnet with reason: host reimage
* 11:58 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 11:58 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1012.eqiad.wmnet
* 11:57 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1254883{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254884{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]], [[gerrit:1254882{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254881{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]] synced to the testservers (see https://wikitech.wikimedia.
* 11:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 11:56 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1372].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 11:56 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-codfw
* 11:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1027.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 11:55 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1254883{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254884{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]], [[gerrit:1254882{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254881{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]]
* 11:54 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1027.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 11:54 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 11:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
* 11:50 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:50 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Updating for dse-k8s-worker1012 - btullis@cumin1003"
* 11:49 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Updating for dse-k8s-worker1012 - btullis@cumin1003"
* 11:49 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1015.eqiad.wmnet with reason: host reimage
* 11:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1037.eqiad.wmnet
* 11:48 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1037.eqiad.wmnet
* 11:48 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1307.eqiad.wmnet
* 11:48 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1307.eqiad.wmnet
* 11:47 claime: sudo homer lsw1-e5-eqiad* commit 'wikikube-worker1307 to active'
* 11:47 cgoubert@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:46 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1015.eqiad.wmnet with reason: host reimage
* 11:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 11:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1026.eqiad.wmnet with OS bookworm
* 11:44 cgoubert@cumin1003: START - Cookbook sre.dns.netbox
* 11:42 cgoubert@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1026.eqiad.wmnet with OS bookworm
* 11:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet
* 11:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
* 11:39 cgoubert@cumin1003: START - Cookbook sre.dns.netbox
* 11:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1037.eqiad.wmnet
* 11:37 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2091.codfw.wmnet
* 11:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet
* 11:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1036.eqiad.wmnet
* 11:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet
* 11:36 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1347.eqiad.wmnet
* 11:34 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1020.eqiad.wmnet with OS bookworm
* 11:31 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 11:31 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2091.codfw.wmnet
* 11:30 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1092.eqiad.wmnet
* 11:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2090.codfw.wmnet
* 11:30 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1347.eqiad.wmnet
* 11:30 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet
* 11:30 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 11:30 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 11:29 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 11:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet
* 11:29 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 11:28 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 11:28 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 11:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet
* 11:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet
* 11:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2090.codfw.wmnet
* 11:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2089.codfw.wmnet
* 11:23 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1092.eqiad.wmnet
* 11:23 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1091.eqiad.wmnet
* 11:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1035.eqiad.wmnet
* 11:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet
* 11:20 btullis@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host dse-k8s-worker1015
* 11:20 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1015
* 11:18 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1091.eqiad.wmnet
* 11:18 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1090.eqiad.wmnet
* 11:18 vgutierrez@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: no reason specified, no task ID specified]
* 11:18 vgutierrez@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: no reason specified, no task ID specified]
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping2004.codfw.wmnet
* 11:17 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
* 11:17 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2088.codfw.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet
* 11:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping2004.codfw.wmnet
* 11:13 vgutierrez@dns1004: END - running authdns-update
* 11:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1035.eqiad.wmnet
* 11:11 vgutierrez@dns1004: START - running authdns-update
* 11:11 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1090.eqiad.wmnet
* 11:11 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1089.eqiad.wmnet
* 11:10 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2088.codfw.wmnet
* 11:10 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2087.codfw.wmnet
* 11:08 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1034.eqiad.wmnet
* 11:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1034.eqiad.wmnet
* 11:07 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1026.eqiad.wmnet with OS bookworm
* 11:05 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1026.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 11:05 btullis@cumin1003: END (PASS) - Cookbook sre.hadoop.reboot-workers (exit_code=0) for Hadoop test cluster
* 11:04 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1089.eqiad.wmnet
* 11:04 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1026.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 11:04 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1088.eqiad.wmnet
* 11:03 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2087.codfw.wmnet
* 11:03 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2086.codfw.wmnet
* 11:03 vgutierrez@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1034.eqiad.wmnet
* 11:00 vgutierrez@cumin1003: START - Cookbook sre.dns.netbox
* 10:59 btullis@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-jumbo-eqiad
* 10:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1034.eqiad.wmnet
* 10:57 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1088.eqiad.wmnet
* 10:57 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1087.eqiad.wmnet
* 10:57 fabfur@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1033.eqiad.wmnet
* 10:56 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2086.codfw.wmnet
* 10:56 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2085.codfw.wmnet
* 10:56 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 10:53 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1087.eqiad.wmnet
* 10:53 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1086.eqiad.wmnet
* 10:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet
* 10:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
* 10:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1032.eqiad.wmnet
* 10:49 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2085.codfw.wmnet
* 10:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet
* 10:49 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2084.codfw.wmnet
* 10:46 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1086.eqiad.wmnet
* 10:46 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1085.eqiad.wmnet
* 10:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet
* 10:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1032.eqiad.wmnet
* 10:40 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2084.codfw.wmnet
* 10:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1031.eqiad.wmnet
* 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2083.codfw.wmnet
* 10:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1031.eqiad.wmnet
* 10:39 fabfur@cumin1003: START - Cookbook sre.dns.netbox
* 10:39 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1085.eqiad.wmnet
* 10:39 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1084.eqiad.wmnet
* 10:37 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2001.codfw.wmnet
* 10:34 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:34 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1031.eqiad.wmnet
* 10:32 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1084.eqiad.wmnet
* 10:32 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1083.eqiad.wmnet
* 10:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2083.codfw.wmnet
* 10:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1031.eqiad.wmnet
* 10:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2082.codfw.wmnet
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1030.eqiad.wmnet
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1030.eqiad.wmnet
* 10:32 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-staging2001.codfw.wmnet
* 10:31 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2002.codfw.wmnet
* 10:26 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-staging2002.codfw.wmnet
* 10:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1030.eqiad.wmnet
* 10:25 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1083.eqiad.wmnet
* 10:25 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1082.eqiad.wmnet
* 10:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2082.codfw.wmnet
* 10:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2081.codfw.wmnet
* 10:24 btullis@cumin1003: START - Cookbook sre.hadoop.reboot-workers for Hadoop test cluster
* 10:23 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2003.codfw.wmnet
* 10:23 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1030.eqiad.wmnet
* 10:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1029.eqiad.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet
* 10:19 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-staging2003.codfw.wmnet
* 10:18 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1082.eqiad.wmnet
* 10:17 vgutierrez@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: no reason specified, no task ID specified]
* 10:17 vgutierrez@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: no reason specified, no task ID specified]
* 10:17 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2081.codfw.wmnet
* 10:17 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2001.codfw.wmnet
* 10:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet
* 10:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1029.eqiad.wmnet
* 10:14 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2001.codfw.wmnet
* 10:14 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2002.codfw.wmnet
* 10:13 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1081.eqiad.wmnet
* 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1028.eqiad.wmnet
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1028.eqiad.wmnet
* 10:11 vgutierrez@dns1004: END - running authdns-update
* 10:10 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2002.codfw.wmnet
* 10:10 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2001.codfw.wmnet
* 10:09 vgutierrez@dns1004: START - running authdns-update
* 10:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2080.codfw.wmnet
* 10:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1028.eqiad.wmnet
* 10:06 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1081.eqiad.wmnet
* 10:06 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1080.eqiad.wmnet
* 10:05 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2001.codfw.wmnet
* 10:05 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2002.codfw.wmnet
* 10:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: no reason specified, [[phab:T418971|T418971]]]
* 10:04 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: no reason specified, [[phab:T418971|T418971]]]
* 10:03 slyngshede@cumin1003: END (FAIL) - Cookbook sre.dns.admin (exit_code=99) DNS admin: pool ulsfo [reason: no reason specified, no task ID specified]
* 10:03 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: no reason specified, no task ID specified]
* 10:01 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2002.codfw.wmnet
* 10:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 23 hosts
* 10:01 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1028.eqiad.wmnet
* 10:01 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1002.eqiad.wmnet
* 10:01 slyngshede@cumin1003: START - Cookbook sre.hosts.remove-downtime for 23 hosts
* 09:59 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1080.eqiad.wmnet
* 09:59 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1079.eqiad.wmnet
* 09:59 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2080.codfw.wmnet
* 09:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2079.codfw.wmnet
* 09:58 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1002.eqiad.wmnet
* 09:57 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1001.eqiad.wmnet
* 09:56 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1027.eqiad.wmnet
* 09:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet
* 09:52 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1001.eqiad.wmnet
* 09:51 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2079.codfw.wmnet
* 09:51 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2078.codfw.wmnet
* 09:51 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1079.eqiad.wmnet
* 09:51 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1003.eqiad.wmnet
* 09:51 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1078.eqiad.wmnet
* 09:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1003.eqiad.wmnet
* 09:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet
* 09:48 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1003.eqiad.wmnet
* 09:48 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1002.eqiad.wmnet
* 09:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1003.eqiad.wmnet
* 09:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1027.eqiad.wmnet
* 09:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 09:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1026.eqiad.wmnet
* 09:46 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1002.eqiad.wmnet
* 09:46 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1001.eqiad.wmnet
* 09:45 moritzm: installing postgresql-15 security updates
* 09:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart A:lvs-secondary-ulsfo and A:liberica
* 09:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) pooling P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2078.codfw.wmnet
* 09:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2077.codfw.wmnet
* 09:45 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.admin pooling P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) depooling P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:44 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.admin depooling P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2003.codfw.wmnet
* 09:44 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.upgrade restart A:lvs-secondary-ulsfo and A:liberica
* 09:44 jayme: switched wikikube staging apiservers to IPIP and maglev in eqiad and codfw - [[phab:T352956|T352956]]
* 09:43 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1078.eqiad.wmnet
* 09:43 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1077.eqiad.wmnet
* 09:42 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1001.eqiad.wmnet
* 09:40 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: wikikube-staging-master-eqiad@eqiad
* 09:40 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:40 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading A:lvs-secondary-ulsfo and A:liberica ([[phab:T418971|T418971]])
* 09:40 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading A:lvs-secondary-ulsfo and A:liberica ([[phab:T418971|T418971]])
* 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2003.codfw.wmnet
* 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1026.eqiad.wmnet
* 09:39 jayme@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:38 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2077.codfw.wmnet
* 09:37 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2076.codfw.wmnet
* 09:37 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-codfw
* 09:37 jayme@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: wikikube-staging-master-eqiad@eqiad
* 09:36 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1077.eqiad.wmnet
* 09:36 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1076.eqiad.wmnet
* 09:35 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: wikikube-staging-master-codfw@codfw
* 09:35 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:35 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 09:30 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2076.codfw.wmnet
* 09:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2075.codfw.wmnet
* 09:26 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1076.eqiad.wmnet
* 09:26 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1075.eqiad.wmnet
* 09:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 09:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1025.eqiad.wmnet
* 09:23 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2075.codfw.wmnet
* 09:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2074.codfw.wmnet
* 09:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 09:19 jayme@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:18 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1075.eqiad.wmnet
* 09:18 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1074.eqiad.wmnet
* 09:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 09:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1025.eqiad.wmnet
* 09:15 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2074.codfw.wmnet
* 09:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2073.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: wikikube-staging-master-codfw@codfw
* 09:14 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 09:13 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-codfw
* 09:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 09:12 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-eqiad
* 09:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1024.eqiad.wmnet
* 09:10 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1074.eqiad.wmnet
* 09:10 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1073.eqiad.wmnet
* 09:08 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2073.codfw.wmnet
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2072.codfw.wmnet
* 09:08 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 23 hosts with reason: Update ULSFO LVS service IPs
* 09:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1024.eqiad.wmnet
* 09:03 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1073.eqiad.wmnet
* 09:03 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1072.eqiad.wmnet
* 09:02 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2072.codfw.wmnet
* 09:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2071.codfw.wmnet
* 09:02 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: no reason specified, [[phab:T418971|T418971]]]
* 09:02 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: no reason specified, [[phab:T418971|T418971]]]
* 09:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping1004.eqiad.wmnet
* 09:00 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 08:58 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1023.eqiad.wmnet
* 08:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1023.eqiad.wmnet
* 08:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping1004.eqiad.wmnet
* 08:56 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1072.eqiad.wmnet
* 08:56 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1071.eqiad.wmnet
* 08:55 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2071.codfw.wmnet
* 08:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2070.codfw.wmnet
* 08:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1023.eqiad.wmnet
* 08:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1023.eqiad.wmnet
* 08:48 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1071.eqiad.wmnet
* 08:47 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2070.codfw.wmnet
* 08:46 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-eqiad
* 08:29 hashar: Restarting CI Jenkins for plugin upgrade # [[phab:T420347|T420347]]
* 08:22 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.20 refs [[phab:T413811|T413811]]
* 07:45 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 32934
* 07:42 btullis@cumin1003: END (PASS) - Cookbook sre.hadoop.reboot-workers (exit_code=0) for Hadoop analytics cluster
* 07:35 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 32934
* 07:22 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
* 07:16 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
* 06:54 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
* 06:38 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
* 03:22 musikanimal@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254468{{!}}CM5: add more aggressive warnings about CM5 deprecation (T373720)]] (duration: 12m 22s)
* 03:18 musikanimal@deploy2002: musikanimal: Continuing with sync
* 03:11 musikanimal@deploy2002: musikanimal: Backport for [[gerrit:1254468{{!}}CM5: add more aggressive warnings about CM5 deprecation (T373720)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 03:09 musikanimal@deploy2002: Started scap sync-world: Backport for [[gerrit:1254468{{!}}CM5: add more aggressive warnings about CM5 deprecation (T373720)]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 47s)
* 02:07 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 02:06 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 02:05 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 02:05 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 02:05 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 02:05 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 02:04 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:38 denisse@deploy2002: Finished deploy [librenms/librenms@9bdfb73]: Upgrade LibreNMS to 26.3.1 (duration: 00m 19s)
* 01:38 denisse@deploy2002: Started deploy [librenms/librenms@9bdfb73]: Upgrade LibreNMS to 26.3.1
* 01:10 denisse@deploy2002: Finished deploy [librenms/librenms@d152b36]: Upgrade LibreNMS to 25.11.0 (duration: 00m 08s)
* 01:10 denisse@deploy2002: Started deploy [librenms/librenms@d152b36]: Upgrade LibreNMS to 25.11.0
== 2026-03-17 ==
* 23:44 btullis@cumin1003: START - Cookbook sre.hadoop.reboot-workers for Hadoop analytics cluster
* 23:38 btullis@cumin1003: END (FAIL) - Cookbook sre.hadoop.reboot-workers (exit_code=99) for Hadoop analytics cluster
* 22:55 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3081.*
* 22:20 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3073.esams.wmnet [reason: trixie reimaging]
* 22:15 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3073.esams.wmnet with OS trixie
* 22:11 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3072.esams.wmnet [reason: trixie reimaging]
* 22:10 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3072.esams.wmnet with OS trixie
* 22:05 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on releases1003.eqiad.wmnet with reason: [[phab:T420246|T420246]]
* 22:05 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T420246|T420246]]
* 21:48 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3073.esams.wmnet with reason: host reimage
* 21:44 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3072.esams.wmnet with reason: host reimage
* 21:41 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3073.esams.wmnet with reason: host reimage
* 21:39 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3072.esams.wmnet with reason: host reimage
* 21:38 ryankemper: [[phab:T411568|T411568]] Failed back HDFS NameNode from an-master1004 to an-master1003; cluster back to original active/standby configuration
* 21:15 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3073.esams.wmnet with OS trixie
* 21:14 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3073.esams.wmnet [reason: trixie reimaging]
* 21:14 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3072.esams.wmnet with OS trixie
* 21:14 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3072.esams.wmnet [reason: trixie reimaging]
* 21:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3070.esams.wmnet [reason: trixie reimaging]
* 21:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3071.esams.wmnet [reason: trixie reimaging]
* 21:09 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3070.esams.wmnet with OS trixie
* 21:05 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3071.esams.wmnet with OS trixie
* 20:59 alexsanford@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254280{{!}}Remove notice from login form in popup mode (T418534)]] (duration: 07m 32s)
* 20:56 alexsanford@deploy2002: alexsanford: Continuing with sync
* 20:54 alexsanford@deploy2002: alexsanford: Backport for [[gerrit:1254280{{!}}Remove notice from login form in popup mode (T418534)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 alexsanford@deploy2002: Started scap sync-world: Backport for [[gerrit:1254280{{!}}Remove notice from login form in popup mode (T418534)]]
* 20:48 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:43 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:43 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3070.esams.wmnet with reason: host reimage
* 20:40 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 20:40 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 20:38 ryankemper: [[phab:T411568|T411568]] failed over HDFS NameNode from an-master1003 to an-master1004, then rebooted `an-master1003`
* 20:38 ryankemper: [[phab:T411568|T411568]] rebooted `an-coord1003`, `an-coord1004`, `an-tool1007`, `an-tool1008`, `an-tool1011`, `an-web1001`
* 20:38 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3071.esams.wmnet with reason: host reimage
* 20:34 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3070.esams.wmnet with reason: host reimage
* 20:34 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3071.esams.wmnet with reason: host reimage
* 20:31 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254301{{!}}Passwordless login: Don't display conditional auth errors]], [[gerrit:1254302{{!}}Passwordless login: Don't display conditional auth errors]] (duration: 08m 56s)
* 20:27 catrope@deploy2002: catrope: Continuing with sync
* 20:24 catrope@deploy2002: catrope: Backport for [[gerrit:1254301{{!}}Passwordless login: Don't display conditional auth errors]], [[gerrit:1254302{{!}}Passwordless login: Don't display conditional auth errors]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:22 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1254301{{!}}Passwordless login: Don't display conditional auth errors]], [[gerrit:1254302{{!}}Passwordless login: Don't display conditional auth errors]]
* 20:16 ryankemper: [[phab:T411568|T411568]] rebooted `an-test-master1002`, `an-test-master1003`, `an-test-master1004`, `archiva1002`
* 20:12 aude@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251309{{!}}Set wgReadingListsBetaDefaultForNewAccountsAfter for beta cluster (T419163)]] (duration: 08m 53s)
* 20:09 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3071.esams.wmnet with OS trixie
* 20:09 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3070.esams.wmnet with OS trixie
* 20:08 aude@deploy2002: aude: Continuing with sync
* 20:08 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3070.esams.wmnet [reason: trixie reimaging]
* 20:08 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3068.esams.wmnet [reason: trixie reimaging]
* 20:07 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3069.esams.wmnet [reason: trixie reimaging]
* 20:06 aude@deploy2002: aude: Backport for [[gerrit:1251309{{!}}Set wgReadingListsBetaDefaultForNewAccountsAfter for beta cluster (T419163)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:03 aude@deploy2002: Started scap sync-world: Backport for [[gerrit:1251309{{!}}Set wgReadingListsBetaDefaultForNewAccountsAfter for beta cluster (T419163)]]
* 19:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3081.esams.wmnet with OS trixie
* 19:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3069.esams.wmnet with OS trixie
* 19:54 ryankemper: [[phab:T411568|T411568]] rebooted `an-test-client1002`, `an-test-ui1001`, `an-test-coord1001`, `an-test-master1001`
* 19:50 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3068.esams.wmnet with OS trixie
* 19:46 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 19:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast2003.wikimedia.org with OS trixie
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3081.esams.wmnet with reason: host reimage
* 19:28 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3081.esams.wmnet with reason: host reimage
* 19:28 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases1003.eqiad.wmnet with reason: [[phab:T420246|T420246]]
* 19:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3069.esams.wmnet with reason: host reimage
* 19:23 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3068.esams.wmnet with reason: host reimage
* 19:21 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3069.esams.wmnet with reason: host reimage
* 19:20 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3068.esams.wmnet with reason: host reimage
* 19:19 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast2003.wikimedia.org with reason: host reimage
* 19:11 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on bast2003.wikimedia.org with reason: host reimage
* 19:08 dzahn@dns1004: END - running authdns-update
* 19:07 dzahn@dns1004: START - running authdns-update
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3081.esams.wmnet with OS trixie
* 19:05 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp3081.esams.wmnet with OS trixie
* 19:00 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3080.*
* 18:56 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3069.esams.wmnet with OS trixie
* 18:55 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3069.esams.wmnet [reason: trixie reimaging]
* 18:55 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3068.esams.wmnet with OS trixie
* 18:55 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid public cluster: Reboot Druid nodes
* 18:54 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3068.esams.wmnet [reason: trixie reimaging]
* 18:53 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3066.esams.wmnet [reason: trixie reimaging]
* 18:53 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3067.esams.wmnet [reason: trixie reimaging]
* 18:50 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host bast2003.wikimedia.org with OS trixie
* 18:49 swfrench-wmf: manually uncordoned wikikube-worker-exp1001.eqiad.wmnet after failed reboot
* 18:47 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3080.esams.wmnet with OS trixie
* 18:43 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3067.esams.wmnet with OS trixie
* 18:40 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3066.esams.wmnet with OS trixie
* 18:32 dwisehaupt@dns1005: END - running authdns-update
* 18:31 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast2003.wikimedia.org with OS bookworm
* 18:31 dwisehaupt@dns1005: START - running authdns-update
* 18:25 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 18:25 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 18:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3080.esams.wmnet with reason: host reimage
* 18:19 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[5-8].magru.wmnet<nowiki>}</nowiki> and A:cp
* 18:19 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7008.magru.wmnet
* 18:17 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:16 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:16 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3067.esams.wmnet with reason: host reimage
* 18:16 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 18:13 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3066.esams.wmnet with reason: host reimage
* 18:09 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3080.esams.wmnet with reason: host reimage
* 18:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast2003.wikimedia.org with reason: host reimage
* 18:04 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3067.esams.wmnet with reason: host reimage
* 18:03 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3066.esams.wmnet with reason: host reimage
* 18:03 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on bast2003.wikimedia.org with reason: host reimage
* 17:52 cgoubert@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1312-1327].eqiad.wmnet,wikikube-worker-exp1001.eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 17:52 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 17:52 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 17:46 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3080.esams.wmnet with OS trixie
* 17:44 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:43 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:43 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:42 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3081.esams.wmnet with OS trixie
* 17:42 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp3081.esams.wmnet with OS trixie
* 17:42 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:41 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid public cluster: Reboot Druid nodes
* 17:40 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:39 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7013.magru.wmnet,cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 17:39 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7016.magru.wmnet
* 17:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:37 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:37 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7007.magru.wmnet
* 17:37 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:34 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:33 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:31 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:29 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3067.esams.wmnet with OS trixie
* 17:29 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3067.esams.wmnet [reason: trixie reimaging]
* 17:28 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3066.esams.wmnet with OS trixie
* 17:28 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:27 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp3066.esams.wmnet with OS trixie
* 17:27 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3066.esams.wmnet with OS trixie
* 17:26 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:26 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3066.esams.wmnet [reason: trixie reimaging]
* 17:21 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:20 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:19 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3081.esams.wmnet with OS trixie
* 17:19 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:16 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1071.eqiad.wmnet
* 17:16 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:16 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2069.codfw.wmnet
* 17:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 17:14 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:14 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:14 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:13 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:13 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:09 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7014.*
* 17:09 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1071.eqiad.wmnet
* 17:09 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2069.codfw.wmnet
* 17:09 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:08 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1070.eqiad.wmnet
* 17:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2068.codfw.wmnet
* 17:07 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:06 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:06 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host bast2003.wikimedia.org with OS bookworm
* 17:06 cgoubert@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1312-1327].eqiad.wmnet,wikikube-worker-exp1001.eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 17:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['bast2003']
* 17:02 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2068.codfw.wmnet
* 17:02 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1070.eqiad.wmnet
* 17:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2067.codfw.wmnet
* 17:01 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1069.eqiad.wmnet
* 17:01 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 17:00 cgoubert@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7014.magru.wmnet with OS trixie
* 16:58 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 16:58 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 16:58 btullis@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
* 16:57 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7015.magru.wmnet
* 16:56 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7006.magru.wmnet
* 16:55 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1069.eqiad.wmnet
* 16:55 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2067.codfw.wmnet
* 16:53 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1068.eqiad.wmnet
* 16:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2066.codfw.wmnet
* 16:47 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:47 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist all cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 16:46 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 16:46 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['bast2003']
* 16:45 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1068.eqiad.wmnet
* 16:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2066.codfw.wmnet
* 16:44 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1067.eqiad.wmnet
* 16:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2065.codfw.wmnet
* 16:42 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid analytics cluster: Reboot Druid nodes
* 16:40 btullis@cumin1003: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
* 16:37 cgoubert@cumin1003: START - Cookbook sre.dns.netbox
* 16:36 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1067.eqiad.wmnet
* 16:36 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2065.codfw.wmnet
* 16:36 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2064.codfw.wmnet
* 16:35 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1066.eqiad.wmnet
* 16:34 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist group2 cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 16:33 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul2003.codfw.wmnet with OS trixie
* 16:33 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7014.magru.wmnet with reason: host reimage
* 16:32 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wikikube-worker[1306,1308-1311].eqiad.wmnet
* 16:32 cgoubert@cumin1003: START - Cookbook sre.hosts.remove-downtime for wikikube-worker[1306,1308-1311].eqiad.wmnet
* 16:28 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7014.magru.wmnet with reason: host reimage
* 16:28 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1066.eqiad.wmnet
* 16:28 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2064.codfw.wmnet
* 16:25 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on releases2003.codfw.wmnet with reason: [[phab:T420246|T420246]]
* 16:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2063.codfw.wmnet
* 16:25 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1065.eqiad.wmnet
* 16:25 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1306,1308-1311].eqiad.wmnet
* 16:25 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1306,1308-1311].eqiad.wmnet
* 16:18 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1065.eqiad.wmnet
* 16:18 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2063.codfw.wmnet
* 16:18 btullis@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-jumbo-eqiad
* 16:17 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2062.codfw.wmnet
* 16:17 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1064.eqiad.wmnet
* 16:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7013.magru.wmnet
* 16:15 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul2003.codfw.wmnet with reason: host reimage
* 16:14 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7005.magru.wmnet
* 16:10 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2062.codfw.wmnet
* 16:08 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul2003.codfw.wmnet with reason: host reimage
* 16:07 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1064.eqiad.wmnet
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7014.magru.wmnet with OS trixie
* 16:05 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp7014.magru.wmnet with OS trixie
* 16:03 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7013.magru.wmnet,cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 16:03 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[5-8].magru.wmnet<nowiki>}</nowiki> and A:cp
* 15:54 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1009.eqiad.wmnet
* 15:54 mutante: zuul2003 - reimaging with trixie
* 15:52 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist group1 cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 15:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2033.codfw.wmnet
* 15:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2033.codfw.wmnet
* 15:46 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul2003.codfw.wmnet with OS trixie
* 15:45 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1009.eqiad.wmnet
* 15:45 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1008.eqiad.wmnet
* 15:44 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist group0 cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 15:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2033.codfw.wmnet
* 15:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 15:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2048.codfw.wmnet
* 15:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2048.codfw.wmnet
* 15:36 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1008.eqiad.wmnet
* 15:36 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1007.eqiad.wmnet
* 15:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2048.codfw.wmnet
* 15:34 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1012.eqiad.wmnet with reason: host reimage
* 15:33 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist testwikis cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 15:32 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid analytics cluster: Reboot Druid nodes
* 15:28 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1007.eqiad.wmnet
* 15:28 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1006.eqiad.wmnet
* 15:27 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1012.eqiad.wmnet with reason: host reimage
* 15:27 samtar@deploy2002: mwscript-k8s job started: cleanupWatchlistLabelMember.php --wiki=testwiki # [[phab:T420328|T420328]]
* 15:27 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2008-dev.codfw.wmnet
* 15:25 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 15:23 jmm@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 15:22 btullis@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-jumbo-eqiad
* 15:21 jmm@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 15:20 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2008-dev.codfw.wmnet
* 15:20 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1006.eqiad.wmnet
* 15:20 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1005.eqiad.wmnet
* 15:18 jmm@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 15:18 urbanecm@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244723{{!}}cleanup: Growth: Remove temporary GrowthMentorList overrides (T418518)]] (duration: 06m 32s)
* 15:16 jmm@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 15:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 16509
* 15:14 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2005-dev.codfw.wmnet
* 15:14 urbanecm@deploy2002: urbanecm: Continuing with sync
* 15:13 urbanecm@deploy2002: urbanecm: Backport for [[gerrit:1244723{{!}}cleanup: Growth: Remove temporary GrowthMentorList overrides (T418518)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:13 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 15:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2048.codfw.wmnet
* 15:11 urbanecm@deploy2002: Started scap sync-world: Backport for [[gerrit:1244723{{!}}cleanup: Growth: Remove temporary GrowthMentorList overrides (T418518)]]
* 15:10 brennen@deploy2002: Finished deploy [phabricator/deployment@e845707]: deploy phab1004 for [[phab:T420366|T420366]] (duration: 01m 02s)
* 15:09 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1005.eqiad.wmnet
* 15:09 brennen@deploy2002: Started deploy [phabricator/deployment@e845707]: deploy phab1004 for [[phab:T420366|T420366]]
* 15:09 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254217{{!}}Create dblists for wikis where CheckUser and AbuseFilter are disabled (T420063 T420062)]] (duration: 06m 38s)
* 15:09 brennen@deploy2002: Finished deploy [phabricator/deployment@e845707]: deploy phab2002 for [[phab:T420366|T420366]] (duration: 00m 35s)
* 15:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudservices2005-dev.codfw.wmnet
* 15:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2004-dev.codfw.wmnet
* 15:08 brennen@deploy2002: Started deploy [phabricator/deployment@e845707]: deploy phab2002 for [[phab:T420366|T420366]]
* 15:08 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2009.codfw.wmnet
* 15:05 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
* 15:05 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 15:05 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
* 15:04 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1254217{{!}}Create dblists for wikis where CheckUser and AbuseFilter are disabled (T420063 T420062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:04 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7014.magru.wmnet with OS trixie
* 15:03 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host phab1004.eqiad.wmnet
* 15:02 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1254217{{!}}Create dblists for wikis where CheckUser and AbuseFilter are disabled (T420063 T420062)]]
* 15:02 topranks: reset BGP session to ssw1-d8-eqiad from lsw1-d4-eqiad [[phab:T420180|T420180]]
* 15:02 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 15:02 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudservices2004-dev.codfw.wmnet
* 15:02 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2003-dev.codfw.wmnet
* 15:00 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2009.codfw.wmnet
* 15:00 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2008.codfw.wmnet
* 14:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2034.codfw.wmnet
* 14:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2034.codfw.wmnet
* 14:57 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host phab1004.eqiad.wmnet
* 14:55 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2003-dev.codfw.wmnet
* 14:55 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2002-dev.codfw.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host durum4004.ulsfo.wmnet
* 14:53 jmm@deploy2002: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 14:53 jmm@deploy2002: helmfile [staging] START helmfile.d/services/thumbor: apply
* 14:52 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2008.codfw.wmnet
* 14:52 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2007.codfw.wmnet
* 14:51 topranks: stop accepting routes on ssw1-d8-eqiad from external peers (cr2-eqiad, other spines) [[phab:T420351|T420351]]
* 14:51 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
* 14:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host durum4004.ulsfo.wmnet
* 14:50 topranks: stop announcing routes from ssw1-d8-eqiad to external peers (cr2-eqiad, other spines) [[phab:T420351|T420351]]
* 14:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2034.codfw.wmnet
* 14:48 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2002-dev.codfw.wmnet
* 14:48 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2001-dev.codfw.wmnet
* 14:46 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2007.codfw.wmnet
* 14:46 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2006.codfw.wmnet
* 14:45 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
* 14:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
* 14:44 taavi: deploying cr firewall changes from https://gerrit.wikimedia.org/r/c/operations/homer/public/+/1254211
* 14:44 topranks: stop announcing "direct" routes to ssw1-d8-eqiad from cr2-eqiad [[phab:T420351|T420351]]
* 14:44 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2034.codfw.wmnet
* 14:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
* 14:43 moritzm: failover Ganeti master in codfw to ganeti2047
* 14:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2050.codfw.wmnet
* 14:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2050.codfw.wmnet
* 14:41 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2001-dev.codfw.wmnet
* 14:41 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2007-dev.codfw.wmnet
* 14:40 topranks: disabling EVPN IBGP peering from ssw1-d8-eqiad to ssw1-d1-eqiad to stop them reflecting routes [[phab:T420351|T420351]]
* 14:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1006.eqiad.wmnet
* 14:39 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
* 14:38 inflatador: bking@requestctl remove `wdqs_highest_error_rate_ever_seen` requestctl rule as it is no longer needed
* 14:38 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2006.codfw.wmnet
* 14:37 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2005.codfw.wmnet
* 14:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2050.codfw.wmnet
* 14:35 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2007-dev.codfw.wmnet
* 14:35 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2006-dev.codfw.wmnet
* 14:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1006.eqiad.wmnet
* 14:34 Daimona: Creating ce_event_goals DB table for the CampaignEvents extension in x1.testwiki, x1.test2wiki, x1.officewiki, and x1.wikishared # [[phab:T411433|T411433]]
* 14:34 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2050.codfw.wmnet
* 14:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2049.codfw.wmnet
* 14:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2049.codfw.wmnet
* 14:31 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
* 14:30 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2005.codfw.wmnet
* 14:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2049.codfw.wmnet
* 14:27 topranks: de-pref internet circuits landing on cr2-eqiad to shift traffic to cr1 [[phab:T420351|T420351]]
* 14:27 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2006-dev.codfw.wmnet
* 14:27 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2005-dev.codfw.wmnet
* 14:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2049.codfw.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2047.codfw.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2047.codfw.wmnet
* 14:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-test-presto1001.eqiad.wmnet
* 14:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host an-test-presto1001.eqiad.wmnet
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2047.codfw.wmnet
* 14:19 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2005-dev.codfw.wmnet
* 14:19 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudlb2004-dev.codfw.wmnet
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 14:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2047.codfw.wmnet
* 14:14 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2046.codfw.wmnet
* 14:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2046.codfw.wmnet
* 14:13 topranks: disable VRRP on cr2-eqiad interfaces facing ssw1-d8-eqiad [[phab:T420351|T420351]]
* 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 14:11 moritzm: powercycling ganeti2046 (stuck on reboot)
* 14:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2004.codfw.wmnet
* 14:10 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudlb2004-dev.codfw.wmnet
* 14:10 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudlb2003-dev.codfw.wmnet
* 14:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people2004.codfw.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet
* 14:05 topranks: setting cr1-eqiad as VRRP master for all vlans [[phab:T420351|T420351]]
* 14:01 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudlb2003-dev.codfw.wmnet
* 14:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet
* 13:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2046.codfw.wmnet
* 13:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6002.drmrs.wmnet
* 13:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6002.drmrs.wmnet
* 13:57 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.20 refs [[phab:T413811|T413811]]
* 13:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2046.codfw.wmnet
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2045.codfw.wmnet
* 13:52 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudlb2002-dev.codfw.wmnet
* 13:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6002.drmrs.wmnet
* 13:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2045.codfw.wmnet
* 13:45 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254189{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]], [[gerrit:1254190{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]] (duration: 08m 10s)
* 13:44 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2003.codfw.wmnet
* 13:42 esanders@deploy2002: esanders: Continuing with sync
* 13:39 esanders@deploy2002: esanders: Backport for [[gerrit:1254189{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]], [[gerrit:1254190{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:38 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be2003.codfw.wmnet
* 13:38 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2004.codfw.wmnet
* 13:37 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1254189{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]], [[gerrit:1254190{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]]
* 13:35 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on logstash2023.codfw.wmnet with reason: ganeti reboot
* 13:33 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 13:32 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host apus-be2004.codfw.wmnet
* 13:32 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2045.codfw.wmnet
* 13:32 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2002.codfw.wmnet
* 13:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6002.drmrs.wmnet
* 13:30 cscott@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251610{{!}}Turn on postprocessing cache for all Parsoid parses (T348255)]] (duration: 10m 31s)
* 13:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 13:26 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be2002.codfw.wmnet
* 13:26 cscott@deploy2002: cscott: Continuing with sync
* 13:26 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2001.codfw.wmnet
* 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6001.drmrs.wmnet
* 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6001.drmrs.wmnet
* 13:22 cscott@deploy2002: cscott: Backport for [[gerrit:1251610{{!}}Turn on postprocessing cache for all Parsoid parses (T348255)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2044.codfw.wmnet
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2044.codfw.wmnet
* 13:20 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be2001.codfw.wmnet
* 13:20 cgoubert@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker13[00-47].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 13:20 cscott@deploy2002: Started scap sync-world: Backport for [[gerrit:1251610{{!}}Turn on postprocessing cache for all Parsoid parses (T348255)]]
* 13:20 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:19 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:19 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2280-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 13:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6001.drmrs.wmnet
* 13:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6001.drmrs.wmnet
* 13:16 btullis@cumin1003: END (PASS) - Cookbook sre.presto.reboot-workers (exit_code=0) for Presto an-presto cluster: Reboot Presto nodes
* 13:15 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1003.eqiad.wmnet
* 13:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2044.codfw.wmnet
* 13:15 aklapper@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254166{{!}}Remove misplaced readonly from CategoryViewer::$query (T420315)]] (duration: 06m 31s)
* 13:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint1001.eqiad.wmnet
* 13:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint1001.eqiad.wmnet
* 13:11 aklapper@deploy2002: zabe, aklapper: Continuing with sync
* 13:11 aklapper@deploy2002: zabe, aklapper: Backport for [[gerrit:1254166{{!}}Remove misplaced readonly from CategoryViewer::$query (T420315)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:10 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:10 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be1003.eqiad.wmnet
* 13:10 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 16509
* 13:09 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1004.eqiad.wmnet
* 13:09 aklapper@deploy2002: Started scap sync-world: Backport for [[gerrit:1254166{{!}}Remove misplaced readonly from CategoryViewer::$query (T420315)]]
* 13:08 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint2001.codfw.wmnet
* 13:04 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host apus-be1004.eqiad.wmnet
* 13:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint2001.codfw.wmnet
* 13:02 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2044.codfw.wmnet
* 13:02 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1002.eqiad.wmnet
* 13:01 moritzm: failover Ganeti masters in drmrs to ganeti6003/6004
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6004.drmrs.wmnet
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6004.drmrs.wmnet
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2043.codfw.wmnet
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2043.codfw.wmnet
* 12:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 214657
* 12:56 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 214657
* 12:56 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 56308
* 12:55 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-eqiad
* 12:55 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 56308
* 12:55 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 28788
* 12:55 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be1002.eqiad.wmnet
* 12:55 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1001.eqiad.wmnet
* 12:54 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 28788
* 12:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6004.drmrs.wmnet
* 12:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2043.codfw.wmnet
* 12:53 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'email' for AS: 28788
* 12:53 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 28788
* 12:53 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 9269
* 12:52 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1012
* 12:52 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:51 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 9269
* 12:51 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1012
* 12:51 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-e8-eqiad
* 12:51 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-e8-eqiad
* 12:50 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:48 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1015
* 12:48 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be1001.eqiad.wmnet
* 12:45 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1015
* 12:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6004.drmrs.wmnet
* 12:44 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:44 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2043.codfw.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6003.drmrs.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6003.drmrs.wmnet
* 12:40 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 12:39 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2042.codfw.wmnet
* 12:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2042.codfw.wmnet
* 12:38 moritzm: powercycling ganeti2042 (stuck on reboot)
* 12:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6003.drmrs.wmnet
* 12:34 moritzm: powercycling ganeti2041 (stuck on reboot)
* 12:31 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:24 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6003.drmrs.wmnet
* 12:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install1005.wikimedia.org
* 12:22 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2042.codfw.wmnet
* 12:20 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-cluster
* 12:20 Emperor: roll-reboot apus frontends (codfw) for March reboots
* 12:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install1005.wikimedia.org
* 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install2005.wikimedia.org
* 12:13 topranks: restart BGP announcements from ssw1-d1-eqiad following change [[phab:T420180|T420180]]
* 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2042.codfw.wmnet
* 12:08 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2280-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 12:07 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
* 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install2005.wikimedia.org
* 12:06 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
* 12:06 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
* 12:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install3004.wikimedia.org
* 12:06 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
* 12:05 jayme@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=(registry1005.eqiad.wmnet{{!}}registry2005.codfw.wmnet)
* 12:05 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2005.codfw.wmnet
* 12:05 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1005.eqiad.wmnet
* 12:04 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
* 12:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install3004.wikimedia.org
* 12:04 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
* 12:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install4003.wikimedia.org
* 12:03 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c7-eqiad [[phab:T420180|T420180]]
* 12:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2041.codfw.wmnet
* 12:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2041.codfw.wmnet
* 12:01 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1005.eqiad.wmnet
* 12:01 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2005.codfw.wmnet
* 12:00 jayme@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=(registry1005.eqiad.wmnet{{!}}registry2005.codfw.wmnet)
* 12:00 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c6-eqiad [[phab:T420180|T420180]]
* 12:00 jayme@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=(registry1004.eqiad.wmnet{{!}}registry2004.codfw.wmnet)
* 11:59 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1004.eqiad.wmnet
* 11:59 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2004.codfw.wmnet
* 11:59 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c4-eqiad [[phab:T420180|T420180]]
* 11:58 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c3-eqiad [[phab:T420180|T420180]]
* 11:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install4003.wikimedia.org
* 11:56 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c2-eqiad [[phab:T420180|T420180]]
* 11:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install5003.wikimedia.org
* 11:55 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1004.eqiad.wmnet
* 11:55 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2004.codfw.wmnet
* 11:54 jayme@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=(registry1004.eqiad.wmnet{{!}}registry2004.codfw.wmnet)
* 11:54 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-d3-eqiad [[phab:T420180|T420180]]
* 11:53 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-d1-eqiad [[phab:T420180|T420180]]
* 11:52 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 11:52 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5007.eqsin.wmnet
* 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5007.eqsin.wmnet
* 11:49 btullis@cumin1003: START - Cookbook sre.hadoop.reboot-workers for Hadoop analytics cluster
* 11:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install5003.wikimedia.org
* 11:48 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-eqiad
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install6003.wikimedia.org
* 11:47 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 11:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install6003.wikimedia.org
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2041.codfw.wmnet
* 11:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5007.eqsin.wmnet
* 11:43 btullis@cumin1003: START - Cookbook sre.presto.reboot-workers for Presto an-presto cluster: Reboot Presto nodes
* 11:41 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5007.eqsin.wmnet
* 11:41 cgoubert@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker13[00-47].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 11:39 topranks: stop accepting external routes on ssw1-d1-eqiad from cr1-eqiad [[phab:T420180|T420180]]
* 11:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install7002.wikimedia.org
* 11:33 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-cluster
* 11:33 Emperor: roll-reboot apus frontends (eqiad) for March reboots
* 11:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install7002.wikimedia.org
* 11:28 moritzm: failover Ganeti master in eqsin to ganeti5004
* 11:27 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2041.codfw.wmnet
* 11:24 topranks: reduce local-preference for BGP routes learnt from servers on cr1-eqiad [[phab:T420180|T420180]]
* 11:22 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 11:18 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2040.codfw.wmnet
* 11:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2040.codfw.wmnet
* 11:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5006.eqsin.wmnet
* 11:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2040.codfw.wmnet
* 11:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5006.eqsin.wmnet
* 11:05 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:05 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:04 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:03 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2040.codfw.wmnet
* 11:01 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5005.eqsin.wmnet
* 11:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5005.eqsin.wmnet
* 11:00 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:59 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:58 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:58 topranks: prepend external BGP announcements from cr1-eqiad [[phab:T420180|T420180]]
* 10:57 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:56 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:56 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2039.codfw.wmnet
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2039.codfw.wmnet
* 10:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5005.eqsin.wmnet
* 10:52 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-eqiad
* 10:51 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:49 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:49 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:49 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5005.eqsin.wmnet
* 10:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2039.codfw.wmnet
* 10:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 10:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 10:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2039.codfw.wmnet
* 10:45 javiermonton@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 10:45 javiermonton@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2038.codfw.wmnet
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2038.codfw.wmnet
* 10:43 javiermonton@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 10:43 javiermonton@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
* 10:42 topranks: cease announcing routed networks from ssw1-d1-eqiad to cr1-eqiad in BGP [[phab:T420180|T420180]]
* 10:41 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:41 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:40 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:39 javiermonton@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 10:39 javiermonton@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
* 10:38 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2038.codfw.wmnet
* 10:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 10:37 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:33 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudgw2004-dev.codfw.wmnet
* 10:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2038.codfw.wmnet
* 10:29 topranks: stop announcing directly connected routes to L3 switches from cr1-eqiad [[phab:T420180|T420180]]
* 10:28 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:27 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudgw2004-dev.codfw.wmnet
* 10:27 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudgw2003-dev.codfw.wmnet
* 10:26 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:25 topranks: disable EVPN IBGP peering between ssw1-d1-eqiad and ssw1-d8-eqiad [[phab:T420180|T420180]]
* 10:24 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:21 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 10:21 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 10:20 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudgw2003-dev.codfw.wmnet
* 10:20 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:19 urbanecm: Delete `job/growthexperiments-listtaskcounts-29513771` from mw-cron (job stuck for more than a month)
* 10:19 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid1003.eqiad.wmnet
* 10:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 10:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2037.codfw.wmnet
* 10:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2037.codfw.wmnet
* 10:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid1003.eqiad.wmnet
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2002.codfw.wmnet
* 10:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2037.codfw.wmnet
* 10:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2002.codfw.wmnet
* 10:06 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2002.codfw.wmnet
* 10:05 topranks: disabling VRRP for et-1/0/5 sub-interfaces on cr1-eqiad [[phab:T420180|T420180]]
* 10:03 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:03 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:02 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:01 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:01 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:01 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:00 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf2002.codfw.wmnet
* 10:00 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2001.codfw.wmnet
* 09:59 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:59 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:59 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:59 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:58 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid2003.codfw.wmnet
* 09:57 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:57 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 09:57 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 09:56 topranks: shift traffic from codfw to eqiad off Arelion CCT to Lumen
* 09:56 mvernon@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling reboot on A:thanos-fe
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid2003.codfw.wmnet
* 09:54 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf2001.codfw.wmnet
* 09:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 09:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2037.codfw.wmnet
* 09:53 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:52 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:50 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:47 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 09:47 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
* 09:42 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:42 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
* 09:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2002.codfw.wmnet
* 09:38 moritzm: installing openssl bugfix updates on trixie hosts
* 09:31 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2002.codfw.wmnet
* 09:31 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2001.codfw.wmnet
* 09:25 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2001.codfw.wmnet
* 09:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1002.eqiad.wmnet
* 09:21 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-eqiad
* 09:20 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1002.eqiad.wmnet
* 09:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
* 09:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
* 09:10 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254114{{!}}Revert "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]] (duration: 12m 36s)
* 09:06 topranks: increase VRRP priority on eqiad vlans on CR2 to shift active gateway to cr2-eqiad [[phab:T420180|T420180]]
* 09:05 mvernon@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling reboot on A:thanos-fe
* 09:03 kharlan@deploy2002: kharlan: Continuing with sync
* 09:02 kharlan@deploy2002: kharlan: Backport for [[gerrit:1254114{{!}}Revert "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:58 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-canary
* 08:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2036.codfw.wmnet
* 08:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2036.codfw.wmnet
* 08:57 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1254114{{!}}Revert "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]]
* 08:57 moritzm: rebuilt the trixie d-i image for the 13.4 point release [[phab:T420240|T420240]]
* 08:54 kharlan@deploy2002: Sync cancelled.
* 08:52 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-canary
* 08:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2036.codfw.wmnet
* 08:49 kharlan@deploy2002: harroyo-wmf, kharlan: Backport for [[gerrit:1250575{{!}}hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend (T419125)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2036.codfw.wmnet
* 08:44 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host bast2003.wikimedia.org
* 08:43 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1250575{{!}}hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend (T419125)]]
* 08:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:42 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:35 arnaudb@cumin1003: END (PASS) - Cookbook sre.gerrit.restart-gerrit (exit_code=0) Restarting Gerrit on gerrit2002
* 08:34 arnaudb@cumin1003: START - Cookbook sre.gerrit.restart-gerrit Restarting Gerrit on gerrit2002
* 08:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
* 08:34 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host contint1002.wikimedia.org
* 08:32 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:28 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
* 08:27 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host contint1002.wikimedia.org
* 08:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2035.codfw.wmnet
* 08:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2035.codfw.wmnet
* 08:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host parsoidtest1001.eqiad.wmnet
* 08:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2035.codfw.wmnet
* 08:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti3005.esams.wmnet to cluster esams03 and group B
* 08:14 moritzm: powercycling bast2003 (stuck on reboot)
* 08:14 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti3005.esams.wmnet to cluster esams03 and group B
* 08:14 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host parsoidtest1001.eqiad.wmnet
* 08:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3005.esams.wmnet
* 08:09 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:08 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3005.esams.wmnet
* 07:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
* 07:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5004.wikimedia.org
* 07:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti3005.esams.wmnet with OS bookworm
* 07:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5004.wikimedia.org
* 07:37 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1285-1289,1291-1299].eqiad.wmnet
* 07:37 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1285-1289,1291-1299].eqiad.wmnet
* 07:34 arnaudb@cumin1003: END (PASS) - Cookbook sre.gerrit.restart-gerrit (exit_code=0) Restarting Gerrit on gerrit2003
* 07:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2035.codfw.wmnet
* 07:32 arnaudb@cumin1003: START - Cookbook sre.gerrit.restart-gerrit Restarting Gerrit on gerrit2003
* 07:32 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2033.codfw.wmnet
* 07:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 07:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2032.codfw.wmnet
* 07:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2032.codfw.wmnet
* 07:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti3005.esams.wmnet with reason: host reimage
* 07:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti3005.esams.wmnet with reason: host reimage
* 07:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2032.codfw.wmnet
* 07:23 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2032.codfw.wmnet
* 07:00 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti3005.esams.wmnet with OS bookworm
* 06:08 kart_: Updated cxserver to 2026-03-16-071247-production ([[phab:T420004|T420004]])
* 06:07 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 06:06 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 06:05 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 06:04 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 05:58 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 05:58 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 04:41 dwisehaupt@dns1005: END - running authdns-update
* 04:39 dwisehaupt@dns1005: START - running authdns-update
* 04:01 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.17 (duration: 01m 17s)
* 03:43 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.20 refs [[phab:T413811|T413811]] (duration: 39m 34s)
* 03:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.20 refs [[phab:T413811|T413811]]
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 10s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 00:26 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6009.*
* 00:25 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6009.drmrs.wmnet with OS trixie
* 00:07 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251158{{!}}Enable languages in main menu on Russian Wikipedia (T419730)]] (duration: 06m 57s)
* 00:03 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 00:02 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1251158{{!}}Enable languages in main menu on Russian Wikipedia (T419730)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1251158{{!}}Enable languages in main menu on Russian Wikipedia (T419730)]]
== 2026-03-16 ==
* 23:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6009.drmrs.wmnet with reason: host reimage
* 23:56 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253604{{!}}Don't output language HTML when no languages present (T419730)]], [[gerrit:1251157{{!}}Support duplication of languages in header and main menu (T419730)]] (duration: 06m 44s)
* 23:56 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6009.drmrs.wmnet with reason: host reimage
* 23:52 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 23:51 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1253604{{!}}Don't output language HTML when no languages present (T419730)]], [[gerrit:1251157{{!}}Support duplication of languages in header and main menu (T419730)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:50 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1253604{{!}}Don't output language HTML when no languages present (T419730)]], [[gerrit:1251157{{!}}Support duplication of languages in header and main menu (T419730)]]
* 23:36 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6009.drmrs.wmnet with OS trixie
* 23:32 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp601(0{{!}}1).*
* 22:54 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6008.drmrs.wmnet [reason: trixie reimaging]
* 22:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6008.drmrs.wmnet with OS trixie
* 22:37 jforrester@deploy2002: Finished scap sync-world: [[phab:T411807|T411807]] (duration: 11m 10s)
* 22:35 jforrester@deploy2002: jforrester: Continuing with sync
* 22:32 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6010.drmrs.wmnet with OS trixie
* 22:31 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp70[09-12].magru.wmnet<nowiki>}</nowiki> and A:cp
* 22:31 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7012.magru.wmnet
* 22:31 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[1285-1289,1291-1299].eqiad.wmnet
* 22:30 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[1-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 22:30 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7004.magru.wmnet
* 22:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6011.drmrs.wmnet with OS trixie
* 22:28 jforrester@deploy2002: jforrester: [[phab:T411807|T411807]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 jforrester@deploy2002: Started scap sync-world: [[phab:T411807|T411807]]
* 22:27 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1285-1289,1291-1299].eqiad.wmnet
* 22:24 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6008.drmrs.wmnet with reason: host reimage
* 22:20 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6008.drmrs.wmnet with reason: host reimage
* 22:17 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1020-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 22:07 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6010.drmrs.wmnet with reason: host reimage
* 22:05 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6007.drmrs.wmnet [reason: trixie reimaging]
* 22:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6011.drmrs.wmnet with reason: host reimage
* 22:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6007.drmrs.wmnet with OS trixie
* 22:02 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6008.drmrs.wmnet with OS trixie
* 21:59 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6010.drmrs.wmnet with reason: host reimage
* 21:58 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6011.drmrs.wmnet with reason: host reimage
* 21:58 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp6008.drmrs.wmnet with OS trixie
* 21:52 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7011.magru.wmnet
* 21:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7003.magru.wmnet
* 21:42 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul1003.eqiad.wmnet with OS trixie
* 21:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6008.drmrs.wmnet with reason: host reimage
* 21:40 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6010.drmrs.wmnet with OS trixie
* 21:40 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6012.*
* 21:39 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6012.drmrs.wmnet with OS trixie
* 21:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6011.drmrs.wmnet with OS trixie
* 21:38 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6007.drmrs.wmnet with reason: host reimage
* 21:36 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6013.*
* 21:36 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6008.drmrs.wmnet with reason: host reimage
* 21:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6013.drmrs.wmnet with OS trixie
* 21:32 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6007.drmrs.wmnet with reason: host reimage
* 21:28 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul1003.eqiad.wmnet with reason: host reimage
* 21:22 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul1003.eqiad.wmnet with reason: host reimage
* 21:19 Dreamy_Jazz: Evening UTC backport window done
* 21:18 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251848{{!}}Disable CheckUser on closed wikis where no checks were ever made (T420062)]], [[gerrit:1251865{{!}}Uninstall SecurePoll from closed wikis (T420062)]], [[gerrit:1251888{{!}}DiscussionTools: Uninstall wikis closed before permalinks were deployed (T420052)]] (duration: 06m 10s)
* 21:17 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6008.drmrs.wmnet with OS trixie
* 21:16 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6008.drmrs.wmnet [reason: trixie reimaging]
* 21:15 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6006.drmrs.wmnet [reason: trixie reimaging]
* 21:15 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6006.drmrs.wmnet with OS trixie
* 21:14 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 21:14 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1251848{{!}}Disable CheckUser on closed wikis where no checks were ever made (T420062)]], [[gerrit:1251865{{!}}Uninstall SecurePoll from closed wikis (T420062)]], [[gerrit:1251888{{!}}DiscussionTools: Uninstall wikis closed before permalinks were deployed (T420052)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:13 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6012.drmrs.wmnet with reason: host reimage
* 21:12 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6007.drmrs.wmnet with OS trixie
* 21:12 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1251848{{!}}Disable CheckUser on closed wikis where no checks were ever made (T420062)]], [[gerrit:1251865{{!}}Uninstall SecurePoll from closed wikis (T420062)]], [[gerrit:1251888{{!}}DiscussionTools: Uninstall wikis closed before permalinks were deployed (T420052)]]
* 21:12 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6007.drmrs.wmnet [reason: trixie reimaging]
* 21:11 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6005.drmrs.wmnet [reason: trixie reimaging]
* 21:10 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6005.drmrs.wmnet with OS trixie
* 21:10 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7010.magru.wmnet
* 21:10 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7002.magru.wmnet
* 21:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6013.drmrs.wmnet with reason: host reimage
* 21:08 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul1003.eqiad.wmnet with OS trixie
* 21:07 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6012.drmrs.wmnet with reason: host reimage
* 21:06 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6013.drmrs.wmnet with reason: host reimage
* 21:05 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253623{{!}}Fix client credentials access tokens (T417278 T419921)]], [[gerrit:1253625{{!}}Enable $wgTrackMediaRequestProvenance on testwikis and beta cluster (T414338)]], [[gerrit:1253626{{!}}Configure $wgApiClientErrorSampleRate (T418957)]] (duration: 08m 06s)
* 21:01 catrope@deploy2002: matmarex, catrope: Continuing with sync
* 20:59 catrope@deploy2002: matmarex, catrope: Backport for [[gerrit:1253623{{!}}Fix client credentials access tokens (T417278 T419921)]], [[gerrit:1253625{{!}}Enable $wgTrackMediaRequestProvenance on testwikis and beta cluster (T414338)]], [[gerrit:1253626{{!}}Configure $wgApiClientErrorSampleRate (T418957)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:57 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1253623{{!}}Fix client credentials access tokens (T417278 T419921)]], [[gerrit:1253625{{!}}Enable $wgTrackMediaRequestProvenance on testwikis and beta cluster (T414338)]], [[gerrit:1253626{{!}}Configure $wgApiClientErrorSampleRate (T418957)]]
* 20:54 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:54 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:50 brett@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cp[2027-2040].codfw.wmnet
* 20:50 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:50 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2027-2040].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:50 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2027-2040].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6006.drmrs.wmnet with reason: host reimage
* 20:48 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6012.drmrs.wmnet with OS trixie
* 20:46 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6013.drmrs.wmnet with OS trixie
* 20:45 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on cp2042.codfw.wmnet with reason: Testing hosts - not for production
* 20:45 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6005.drmrs.wmnet with reason: host reimage
* 20:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2042.codfw.wmnet with OS trixie
* 20:44 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephmon2007-dev.codfw.wmnet with OS bookworm
* 20:44 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 20:44 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253566{{!}}Configure external link aggregate usage on 12 wikis for top domains (T419837)]] (duration: 06m 59s)
* 20:43 brett@cumin2002: START - Cookbook sre.dns.netbox
* 20:41 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6006.drmrs.wmnet with reason: host reimage
* 20:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on cp2041.codfw.wmnet with reason: Testing hosts - not for production
* 20:41 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6005.drmrs.wmnet with reason: host reimage
* 20:40 kharlan@deploy2002: kharlan, mszwarc: Continuing with sync
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2041.codfw.wmnet with OS trixie
* 20:39 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 20:38 kharlan@deploy2002: kharlan, mszwarc: Backport for [[gerrit:1253566{{!}}Configure external link aggregate usage on 12 wikis for top domains (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:37 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1253566{{!}}Configure external link aggregate usage on 12 wikis for top domains (T419837)]]
* 20:34 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6014.*
* 20:33 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:33 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:32 cscott@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253551{{!}}Fix double post-processing in legacy preview case (T419908)]] (duration: 06m 52s)
* 20:29 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7009.magru.wmnet
* 20:28 cscott@deploy2002: cscott: Continuing with sync
* 20:28 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7001.magru.wmnet
* 20:27 cscott@deploy2002: cscott: Backport for [[gerrit:1253551{{!}}Fix double post-processing in legacy preview case (T419908)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:26 cscott@deploy2002: Started scap sync-world: Backport for [[gerrit:1253551{{!}}Fix double post-processing in legacy preview case (T419908)]]
* 20:22 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2042.codfw.wmnet with reason: host reimage
* 20:22 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6006.drmrs.wmnet with OS trixie
* 20:21 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6006.drmrs.wmnet [reason: trixie reimaging]
* 20:21 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6004.drmrs.wmnet [reason: trixie reimaging]
* 20:21 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6005.drmrs.wmnet with OS trixie
* 20:20 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6004.drmrs.wmnet with OS trixie
* 20:20 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6005.drmrs.wmnet [reason: trixie reimaging]
* 20:19 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6003.drmrs.wmnet [reason: trixie reimaging]
* 20:19 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephmon2007-dev.codfw.wmnet with reason: host reimage
* 20:19 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp70[09-12].magru.wmnet<nowiki>}</nowiki> and A:cp
* 20:18 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[1-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 20:17 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248665{{!}}Enable passwordless login in production (T419198)]], [[gerrit:1253572{{!}}Instrument clicks on external links to selected domains (T419837)]] (duration: 06m 43s)
* 20:16 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:15 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
* 20:15 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephmon2007-dev.codfw.wmnet with reason: host reimage
* 20:15 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6003.drmrs.wmnet with OS trixie
* 20:13 catrope@deploy2002: kharlan, catrope: Continuing with sync
* 20:12 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:12 catrope@deploy2002: kharlan, catrope: Backport for [[gerrit:1248665{{!}}Enable passwordless login in production (T419198)]], [[gerrit:1253572{{!}}Instrument clicks on external links to selected domains (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:12 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2042.codfw.wmnet with reason: host reimage
* 20:11 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
* 20:10 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1248665{{!}}Enable passwordless login in production (T419198)]], [[gerrit:1253572{{!}}Instrument clicks on external links to selected domains (T419837)]]
* 20:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6014.drmrs.wmnet with OS trixie
* 20:03 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp[2027-2040].codfw.wmnet
* 20:01 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251589{{!}}Uninstall GlobalBlocking from closed wikis (T420062)]] (duration: 08m 20s)
* 19:57 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 19:55 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6004.drmrs.wmnet with reason: host reimage
* 19:55 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephmon2007-dev.codfw.wmnet with OS bookworm
* 19:54 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1251589{{!}}Uninstall GlobalBlocking from closed wikis (T420062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:54 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp2042.codfw.wmnet with OS trixie
* 19:53 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephmon2007-dev.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:53 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS trixie
* 19:52 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1251589{{!}}Uninstall GlobalBlocking from closed wikis (T420062)]]
* 19:52 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host cloudcephmon2007-dev.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:51 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251582{{!}}Uninstall AbuseFilter from closed wikis with no AbuseFilter logs (T420063)]] (duration: 09m 26s)
* 19:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6003.drmrs.wmnet with reason: host reimage
* 19:47 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 19:47 mutante: releases2003 - rm rsync-srv-org-wikimedia-releases-releases2003.* - alerts flapping since server reboot - puppet code needs to be improved to ensure units are removed when primary server is switched ([[phab:T420246|T420246]])
* 19:47 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6004.drmrs.wmnet with reason: host reimage
* 19:46 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6003.drmrs.wmnet with reason: host reimage
* 19:44 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1251582{{!}}Uninstall AbuseFilter from closed wikis with no AbuseFilter logs (T420063)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:43 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6014.drmrs.wmnet with reason: host reimage
* 19:42 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1251582{{!}}Uninstall AbuseFilter from closed wikis with no AbuseFilter logs (T420063)]]
* 19:41 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudcephmon2007-dev
* 19:41 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudcephmon2007-dev
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating cloudcephmon2007-dev in codfw - jhancock@cumin2002"
* 19:40 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6014.drmrs.wmnet with reason: host reimage
* 19:39 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253622{{!}}Revert "Media: Use previous step for non-standard width between steps and original" (T419927)]] (duration: 07m 10s)
* 19:35 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 19:34 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1253622{{!}}Revert "Media: Use previous step for non-standard width between steps and original" (T419927)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:32 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating cloudcephmon2007-dev in codfw - jhancock@cumin2002"
* 19:32 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1253622{{!}}Revert "Media: Use previous step for non-standard width between steps and original" (T419927)]]
* 19:28 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp404[5-6].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 19:28 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4046.ulsfo.wmnet
* 19:27 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6004.drmrs.wmnet with OS trixie
* 19:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 19:27 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6004.drmrs.wmnet [reason: trixie reimaging]
* 19:27 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6003.drmrs.wmnet with OS trixie
* 19:26 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6003.drmrs.wmnet [reason: trixie reimaging]
* 19:25 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6002.drmrs.wmnet [reason: trixie reimaging]
* 19:25 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6001.drmrs.wmnet [reason: trixie reimaging]
* 19:21 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6014.drmrs.wmnet with OS trixie
* 19:17 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6002.drmrs.wmnet with OS trixie
* 19:17 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on cp2042.codfw.wmnet with reason: Testing hosts - not for production
* 19:16 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on cp2041.codfw.wmnet with reason: Testing hosts - not for production
* 19:15 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:15 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:12 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6001.drmrs.wmnet with OS trixie
* 19:02 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:02 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 18:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp4046.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 18:57 cdobbins@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4046.ulsfo.wmnet
* 18:52 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6002.drmrs.wmnet with reason: host reimage
* 18:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4045.ulsfo.wmnet
* 18:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6001.drmrs.wmnet with reason: host reimage
* 18:47 cdobbins@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp4046.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 18:47 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6002.drmrs.wmnet with reason: host reimage
* 18:45 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6001.drmrs.wmnet with reason: host reimage
* 18:39 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp404[5-6].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 18:38 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6015.*
* 18:38 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp403[7-9].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 18:38 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4039.ulsfo.wmnet
* 18:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6015.drmrs.wmnet with OS trixie
* 18:27 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6002.drmrs.wmnet with OS trixie
* 18:26 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6002.drmrs.wmnet [reason: trixie reimaging]
* 18:26 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6001.drmrs.wmnet with OS trixie
* 18:24 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6001.drmrs.wmnet [reason: trixie reimaging]
* 18:03 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6015.drmrs.wmnet with reason: host reimage
* 17:59 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6015.drmrs.wmnet with reason: host reimage
* 17:58 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4038.ulsfo.wmnet
* 17:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6015.drmrs.wmnet with OS trixie
* 17:37 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6016.*
* 17:32 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2042.codfw.wmnet with OS trixie
* 17:18 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4037.ulsfo.wmnet
* 17:08 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2042.codfw.wmnet with reason: host reimage
* 17:06 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp403[7-9].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 17:03 fabfur@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2042.codfw.wmnet with reason: host reimage
* 17:02 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6016.drmrs.wmnet with OS trixie
* 16:57 mutante: contint2002 - rebooting
* 16:47 mutante: phab2002 - rebooting
* 16:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:44 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253520{{!}}Add APCOND_OATH_HAS2FA to UserRequirementsPrivateConditions]] (duration: 06m 15s)
* 16:42 mutante: rebooting backends of releases.wikimedia.org
* 16:42 fabfur@cumin1003: START - Cookbook sre.hosts.reimage for host cp2042.codfw.wmnet with OS trixie
* 16:41 fabfur: reimage cp2042 for HAProxy testing ([[phab:T419825|T419825]])
* 16:41 mszwarc@deploy2002: mszwarc: Continuing with sync
* 16:40 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1253520{{!}}Add APCOND_OATH_HAS2FA to UserRequirementsPrivateConditions]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:39 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2041.codfw.wmnet with OS trixie
* 16:38 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1253520{{!}}Add APCOND_OATH_HAS2FA to UserRequirementsPrivateConditions]]
* 16:37 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1020-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 16:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6016.drmrs.wmnet with reason: host reimage
* 16:32 milimetric: my bad, accidentally merged https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1250249, will read docs on config deployment better
* 16:31 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:29 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1012
* 16:27 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1012
* 16:27 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6016.drmrs.wmnet with reason: host reimage
* 16:20 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]] (duration: 07m 28s)
* 16:17 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode2001.codfw.wmnet
* 16:16 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 16:14 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:13 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2005.codfw.wmnet
* 16:12 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]]
* 16:12 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode2001.codfw.wmnet
* 16:11 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode1001.eqiad.wmnet
* 16:11 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts,name=codfw
* 16:11 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
* 16:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum2001.codfw.wmnet
* 16:09 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1024.eqiad.wmnet
* 16:09 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1024.eqiad.wmnet
* 16:09 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1024.eqiad.wmnet
* 16:07 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1004-1007,1011-1012,1015-1016,1019-1021,1029-1031,1034-1168,1240-1289,1291-1327].eqiad.wmnet
* 16:06 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1004-1007,1011-1012,1015-1016,1019-1021,1029-1031,1034-1168,1240-1289,1291-1327].eqiad.wmnet
* 16:06 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6016.drmrs.wmnet with OS trixie
* 16:06 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2005.codfw.wmnet
* 16:06 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode1001.eqiad.wmnet
* 16:05 dwisehaupt@dns1006: END - running authdns-update
* 16:05 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum2001.codfw.wmnet
* 16:05 fabfur@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
* 16:04 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts,name=codfw
* 16:04 dwisehaupt@dns1006: START - running authdns-update
* 16:04 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts,name=eqiad
* 16:00 blake@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1004-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 15:59 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum1001.eqiad.wmnet
* 15:59 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2031.codfw.wmnet
* 15:59 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti2031.codfw.wmnet
* 15:54 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2004.codfw.wmnet
* 15:53 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum1001.eqiad.wmnet
* 15:52 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts,name=eqiad
* 15:52 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncmonitor1001.eqiad.wmnet
* 15:47 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2004.codfw.wmnet
* 15:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2031.codfw.wmnet
* 15:47 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet
* 15:46 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1024.eqiad.wmnet with reason: Rebooting clouddb1024 [[phab:T419960|T419960]]
* 15:44 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1024.eqiad.wmnet
* 15:43 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2031.codfw.wmnet
* 15:43 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1023.eqiad.wmnet
* 15:43 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1023.eqiad.wmnet
* 15:43 fabfur@cumin1003: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS trixie
* 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2030.codfw.wmnet
* 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2030.codfw.wmnet
* 15:42 fabfur: reimage cp2041 for HAProxy testing ([[phab:T419825|T419825]])
* 15:42 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:41 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:41 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2003.codfw.wmnet
* 15:39 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:38 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:38 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-codfw
* 15:38 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:38 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:37 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Rebooting clouddb1023 [[phab:T419960|T419960]]
* 15:35 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1022.eqiad.wmnet
* 15:35 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1022.eqiad.wmnet
* 15:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2030.codfw.wmnet
* 15:32 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2003.codfw.wmnet
* 15:32 dwisehaupt@dns1006: END - running authdns-update
* 15:32 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2002.codfw.wmnet
* 15:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 15:31 dwisehaupt@dns1006: START - running authdns-update
* 15:27 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling reboot on A:swift-fe-codfw
* 15:26 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-codfw
* 15:26 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2029.codfw.wmnet
* 15:26 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti2029.codfw.wmnet
* 15:26 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:26 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:24 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:24 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:22 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2002.codfw.wmnet
* 15:21 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Rebooting clouddb1023 [[phab:T419960|T419960]]
* 15:20 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2001.codfw.wmnet
* 15:20 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]]
* 15:16 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Rebooting clouddb1022 [[phab:T419960|T419960]]
* 15:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2029.codfw.wmnet
* 15:11 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling reboot on A:durum and A:durum
* 15:09 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2029.codfw.wmnet
* 15:04 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2001.codfw.wmnet
* 15:03 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:02 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling reboot on A:wikidough
* 15:01 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudgw1004.eqiad.wmnet
* 15:01 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 15:00 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2028.codfw.wmnet
* 15:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2028.codfw.wmnet
* 14:56 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:55 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:55 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:54 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]]
* 14:53 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudgw1004.eqiad.wmnet
* 14:53 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-codfw
* 14:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2028.codfw.wmnet
* 14:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2028.codfw.wmnet
* 14:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:50 mvernon@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling reboot on A:swift-fe-eqiad
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw1003.eqiad.wmnet with OS trixie
* 14:31 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:31 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:30 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:28 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:28 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:26 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:22 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1002-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1002-1003].eqiad.wmnet
* 14:22 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1002-1003].eqiad.wmnet
* 14:21 blake@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1002-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:21 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1002-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:20 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:20 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:20 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:18 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253461{{!}}fix(anon warning): remove wring type=signup param (T415160)]], [[gerrit:1253450{{!}}AccountCreation: track account registrations for WE1.8 experiments (T416100)]] (duration: 09m 16s)
* 14:17 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:17 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:14 sgimeno@deploy2002: sgimeno: Continuing with sync
* 14:13 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:13 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:11 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:11 sgimeno@deploy2002: sgimeno: Backport for [[gerrit:1253461{{!}}fix(anon warning): remove wring type=signup param (T415160)]], [[gerrit:1253450{{!}}AccountCreation: track account registrations for WE1.8 experiments (T416100)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:11 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:10 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw1003.eqiad.wmnet with reason: host reimage
* 14:09 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1253461{{!}}fix(anon warning): remove wring type=signup param (T415160)]], [[gerrit:1253450{{!}}AccountCreation: track account registrations for WE1.8 experiments (T416100)]]
* 14:08 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2026.codfw.wmnet
* 14:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 14:08 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:04 arnaudb@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gerrit2002.wikimedia.org with reason: testing
* 14:03 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw1003.eqiad.wmnet with reason: host reimage
* 14:02 arnaudb@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 2:00:00 on gerrit2002.wikimedia.org with reason: [[phab:T418256|T418256]]
* 14:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1003.eqiad.wmnet
* 13:58 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 13:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1003.eqiad.wmnet
* 13:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1002.eqiad.wmnet
* 13:45 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253046{{!}}bowiki: update logos (T419268)]] (duration: 06m 17s)
* 13:45 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudgw1003.eqiad.wmnet with OS trixie
* 13:43 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-codfw
* 13:43 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad
* 13:41 mszwarc@deploy2002: mszwarc, anzx: Continuing with sync
* 13:41 mszwarc@deploy2002: mszwarc, anzx: Backport for [[gerrit:1253046{{!}}bowiki: update logos (T419268)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1002.eqiad.wmnet
* 13:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2004.codfw.wmnet
* 13:39 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1253046{{!}}bowiki: update logos (T419268)]]
* 13:38 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253423{{!}}Always use external actor for interwiki rights logs on target wiki (T6055)]] (duration: 08m 53s)
* 13:34 mszwarc@deploy2002: mszwarc: Continuing with sync
* 13:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2004.codfw.wmnet
* 13:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2026.codfw.wmnet
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2025.codfw.wmnet
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2003.codfw.wmnet
* 13:31 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1253423{{!}}Always use external actor for interwiki rights logs on target wiki (T6055)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 13:29 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1253423{{!}}Always use external actor for interwiki rights logs on target wiki (T6055)]]
* 13:28 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad
* 13:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2003.codfw.wmnet
* 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow3004.esams.wmnet
* 13:25 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling reboot on A:wikidough
* 13:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 13:22 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-durum rolling reboot on A:durum and A:durum
* 13:21 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2025.codfw.wmnet
* 13:21 XioNoX: drain edgeuno transit for optic replacement - [[phab:T415743|T415743]]
* 13:19 cgoubert@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host wikikube-ctrl1004.eqiad.wmnet
* 13:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow3004.esams.wmnet
* 13:14 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251487{{!}}Replace direct BagOStuff with WANObjectCache (T419666)]] (duration: 11m 25s)
* 13:11 bwojtowicz@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow4003.ulsfo.wmnet
* 13:09 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti3005.esams.wmnet
* 13:09 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ganeti3005.esams.wmnet
* 13:07 jforrester@deploy2002: jforrester: Continuing with sync
* 13:06 jforrester@deploy2002: jforrester: Backport for [[gerrit:1251487{{!}}Replace direct BagOStuff with WANObjectCache (T419666)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl1004.eqiad.wmnet
* 13:04 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 13:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ncredir4002.ulsfo.wmnet
* 13:04 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ncredir4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 13:03 jiji@cumin1003: END (ERROR) - Cookbook sre.memcached.roll-reboot-restart (exit_code=97) rolling reboot on A:memcached-gutter-eqiad
* 13:03 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ncredir4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 13:03 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1251487{{!}}Replace direct BagOStuff with WANObjectCache (T419666)]]
* 13:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow4003.ulsfo.wmnet
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow5002.eqsin.wmnet
* 12:51 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl1003.eqiad.wmnet
* 12:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow5002.eqsin.wmnet
* 12:48 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3005.esams.wmnet
* 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow6001.drmrs.wmnet
* 12:44 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:43 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3005.esams.wmnet
* 12:42 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl1003.eqiad.wmnet
* 12:41 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl1002.eqiad.wmnet
* 12:40 aikochou@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 12:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow6001.drmrs.wmnet
* 12:37 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ncredir4002.ulsfo.wmnet
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ncredir4001.ulsfo.wmnet
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ncredir4001.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 12:35 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:34 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow7002.magru.wmnet
* 12:32 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:32 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:32 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:28 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1017
* 12:27 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ncredir4001.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 12:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow7002.magru.wmnet
* 12:27 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1017
* 12:25 aikochou@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 12:25 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl1002.eqiad.wmnet
* 12:22 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:20 moritzm: failover Ganeti master in esams to ganeti3008
* 12:20 moritzm: failover Ganeti master in esams to ganeti3005
* 12:15 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:14 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:10 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ncredir4001.ulsfo.wmnet
* 12:00 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti3006.esams.wmnet
* 12:00 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti3006.esams.wmnet
* 11:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for druid[1009-1013].eqiad.wmnet
* 11:57 btullis@cumin1003: START - Cookbook sre.hosts.remove-downtime for druid[1009-1013].eqiad.wmnet
* 11:57 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.remove-downtime (exit_code=97) for druid[1009-1013].eqiad.wmnet
* 11:57 btullis@cumin1003: START - Cookbook sre.hosts.remove-downtime for druid[1009-1013].eqiad.wmnet
* 11:46 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1009.eqiad.wmnet with OS bookworm
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3006.esams.wmnet
* 11:43 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1010.eqiad.wmnet with OS bookworm
* 11:37 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1011.eqiad.wmnet with OS bookworm
* 11:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1012.eqiad.wmnet with OS bookworm
* 11:30 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3006.esams.wmnet
* 11:29 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1013.eqiad.wmnet with OS bookworm
* 11:24 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:24 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:22 btullis@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6 days, 0:00:00 on dse-k8s-worker[1012,1015-1017].eqiad.wmnet with reason: Adding 10 Gbps NIC
* 11:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1009.eqiad.wmnet with reason: host reimage
* 11:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1010.eqiad.wmnet with reason: host reimage
* 11:15 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:14 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:12 mvernon@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling reboot on A:swift-fe-eqiad
* 11:12 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling reboot on A:swift-fe-codfw
* 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1011.eqiad.wmnet with reason: host reimage
* 11:10 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:10 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1012.eqiad.wmnet with reason: host reimage
* 11:07 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gerrit2003.wikimedia.org
* 11:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1013.eqiad.wmnet with reason: host reimage
* 11:06 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 11:06 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 11:04 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1010.eqiad.wmnet with reason: host reimage
* 11:02 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1011.eqiad.wmnet with reason: host reimage
* 11:02 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1009.eqiad.wmnet with reason: host reimage
* 11:01 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1012.eqiad.wmnet with reason: host reimage
* 11:00 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gerrit2003.wikimedia.org
* 10:57 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1013.eqiad.wmnet with reason: host reimage
* 10:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2010.codfw.wmnet
* 10:47 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1013.eqiad.wmnet with OS bookworm
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1012.eqiad.wmnet with OS bookworm
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1011.eqiad.wmnet with OS bookworm
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1010.eqiad.wmnet with OS bookworm
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1009.eqiad.wmnet with OS bookworm
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3007.esams.wmnet
* 10:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2010.codfw.wmnet
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3007.esams.wmnet
* 10:31 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3007.esams.wmnet
* 10:29 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3007.esams.wmnet
* 10:28 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:28 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3008.esams.wmnet
* 10:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3008.esams.wmnet
* 10:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2009.codfw.wmnet
* 10:24 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:24 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:23 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:20 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2009.codfw.wmnet
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3008.esams.wmnet
* 10:14 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3008.esams.wmnet
* 10:09 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:08 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2004.codfw.wmnet
* 10:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7001.magru.wmnet
* 10:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7001.magru.wmnet
* 10:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2004.codfw.wmnet
* 09:56 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7001.magru.wmnet
* 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts tcp-proxy4002.ulsfo.wmnet
* 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: tcp-proxy4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 09:54 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: tcp-proxy4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 09:51 elukey@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 09:51 elukey@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 09:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7001.magru.wmnet
* 09:51 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:46 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts tcp-proxy4002.ulsfo.wmnet
* 09:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: decom tcp-proxy4001 - jmm@cumin2002"
* 09:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: decom tcp-proxy4001 - jmm@cumin2002"
* 09:43 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm2001.wikimedia.org
* 09:39 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idm2001.wikimedia.org
* 09:38 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:38 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 09:38 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 09:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:35 slyngshede@dns1004: END - running authdns-update
* 09:34 elukey@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 09:34 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 09:34 elukey@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 09:33 slyngshede@dns1004: START - running authdns-update
* 09:32 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:30 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm1001.wikimedia.org
* 09:26 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 09:26 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idm1001.wikimedia.org
* 09:24 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm-test1001.wikimedia.org
* 09:22 moritzm: failover Ganeti master in magru to ganeti7004
* 09:21 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts tcp-proxy4001.ulsfo.wmnet
* 09:21 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 09:20 jmm@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-test-eqiad
* 09:20 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idm-test1001.wikimedia.org
* 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2002.codfw.wmnet
* 09:18 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:15 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM cloudidp2001-dev.codfw.wmnet
* 09:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2002.codfw.wmnet
* 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7002.magru.wmnet
* 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7002.magru.wmnet
* 09:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2001.codfw.wmnet
* 09:13 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts tcp-proxy4001.ulsfo.wmnet
* 09:11 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM cloudidp2001-dev.codfw.wmnet
* 09:09 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idp2005.wikimedia.org
* 09:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2001.codfw.wmnet
* 09:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7002.magru.wmnet
* 09:05 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idp2005.wikimedia.org
* 09:03 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7002.magru.wmnet
* 09:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pki-root1002.eqiad.wmnet
* 08:59 slyngshede@dns1004: END - running authdns-update
* 08:58 slyngshede@dns1004: START - running authdns-update
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7003.magru.wmnet
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7003.magru.wmnet
* 08:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host pki-root1002.eqiad.wmnet
* 08:49 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idp1005.wikimedia.org
* 08:48 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:48 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 08:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7003.magru.wmnet
* 08:48 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 08:47 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7003.magru.wmnet
* 08:44 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idp1005.wikimedia.org
* 08:44 jmm@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-test-eqiad
* 08:44 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 08:39 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idp-test1005.wikimedia.org
* 08:35 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idp-test1005.wikimedia.org
* 08:33 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idp-test2005.wikimedia.org
* 08:29 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idp-test2005.wikimedia.org
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7004.magru.wmnet
* 08:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7004.magru.wmnet
* 08:22 taavi@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet,service=s3
* 08:18 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251276{{!}}Fix broken survey links on PersonalDashboard (T419950)]] (duration: 32m 09s)
* 08:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7004.magru.wmnet
* 08:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7004.magru.wmnet
* 08:06 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 08:05 kgraessle@deploy2002: kgraessle: Continuing with sync
* 08:04 kgraessle@deploy2002: kgraessle: Backport for [[gerrit:1251276{{!}}Fix broken survey links on PersonalDashboard (T419950)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:59 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 07:52 moritzm: installing Linux 5.10.251 on Bullseye hosts
* 07:45 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1251276{{!}}Fix broken survey links on PersonalDashboard (T419950)]]
* 07:37 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stewards1001.eqiad.wmnet
* 07:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host stewards1001.eqiad.wmnet
* 07:33 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1003.eqiad.wmnet
* 07:26 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1003.eqiad.wmnet
* 07:25 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aphlict1002.eqiad.wmnet
* 07:21 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host aphlict1002.eqiad.wmnet
* 07:10 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host doc2003.codfw.wmnet
* 07:06 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host doc2003.codfw.wmnet
* 07:02 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists1004.wikimedia.org
* 06:55 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists1004.wikimedia.org
* 05:25 aikochou@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 52s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-15 ==
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 52s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-14 ==
* 14:16 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251941{{!}}CommonSettings: Set class in $wgCentralAuthRC]] (duration: 06m 17s)
* 14:12 reedy@deploy2002: reedy: Continuing with sync
* 14:11 reedy@deploy2002: reedy: Backport for [[gerrit:1251941{{!}}CommonSettings: Set class in $wgCentralAuthRC]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:10 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1251941{{!}}CommonSettings: Set class in $wgCentralAuthRC]]
* 12:51 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251912{{!}}CommonSettings: Specify class in IRC RCFeed setup]] (duration: 06m 19s)
* 12:47 reedy@deploy2002: reedy, lcawte: Continuing with sync
* 12:46 reedy@deploy2002: reedy, lcawte: Backport for [[gerrit:1251912{{!}}CommonSettings: Specify class in IRC RCFeed setup]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:44 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1251912{{!}}CommonSettings: Specify class in IRC RCFeed setup]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 00s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-13 ==
* 22:52 taavi: taavi@deploy2002 ~ $ mwscript CentralAuth:attachAccount.php --wiki=metawiki --userlist backfiller.txt # unify unified Special:CentralAuth/MediaWikiAccountBackfiller on meta
* 20:07 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2004-dev.codfw.wmnet
* 20:01 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudservices2004-dev.codfw.wmnet
* 20:01 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2004-dev.codfw.wmnet
* 19:55 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4052.*
* 19:54 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudservices2004-dev.codfw.wmnet
* 19:54 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4052.ulsfo.wmnet with OS trixie
* 19:53 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2005-dev.codfw.wmnet
* 19:46 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudservices2005-dev.codfw.wmnet
* 19:46 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2004-dev.codfw.wmnet
* 19:40 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4050.*
* 19:40 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudservices2004-dev.codfw.wmnet
* 19:29 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 19:24 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 19:23 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4050.ulsfo.wmnet
* 19:19 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1035.eqiad.wmnet with OS trixie
* 19:19 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1034.eqiad.wmnet with OS trixie
* 19:18 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:18 jclark@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:18 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:16 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4051.*
* 19:15 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp4050.ulsfo.wmnet
* 19:14 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4050.ulsfo.wmnet
* 19:13 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:11 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4051.ulsfo.wmnet with OS trixie
* 19:07 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp4050.ulsfo.wmnet
* 19:02 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1035.eqiad.wmnet with reason: host reimage
* 19:00 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS bookworm
* 19:00 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:00 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 18:58 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1034.eqiad.wmnet with reason: host reimage
* 18:58 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS trixie
* 18:57 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4052.ulsfo.wmnet with OS trixie
* 18:55 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1035.eqiad.wmnet with reason: host reimage
* 18:55 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1034.eqiad.wmnet with reason: host reimage
* 18:47 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4051.ulsfo.wmnet with reason: host reimage
* 18:43 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 18:41 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4051.ulsfo.wmnet with reason: host reimage
* 18:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1035.eqiad.wmnet with OS trixie
* 18:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1034.eqiad.wmnet with OS trixie
* 18:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1033.eqiad.wmnet with OS trixie
* 18:36 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 18:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp4050.ulsfo.wmnet with reason: firmware updates
* 18:34 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS trixie
* 18:24 brett@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp4050.ulsfo.wmnet
* 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4050.ulsfo.wmnet with OS trixie
* 18:22 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS bookworm
* 18:21 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1374.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 18:21 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4051.ulsfo.wmnet with OS trixie
* 18:21 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4051.ulsfo.wmnet with OS trixie
* 18:12 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS bookworm
* 18:10 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1374.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 18:10 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:10 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 18:10 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 18:10 elukey@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1253.eqiad.wmnet with reason: Host went down and paged, depooled
* 18:06 cgoubert@cumin1003: dbctl commit (dc=all): 'Depool db1253', diff saved to https://phabricator.wikimedia.org/P89856 and previous config saved to /var/cache/conftool/dbconfig/20260313-180640-cgoubert.json
* 18:06 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 18:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4051.ulsfo.wmnet with OS trixie
* 18:03 elukey: powercycle db1253 - host not reachable via ssh, no events logged in racadm getsel, no console com2 available (blank screen)
* 17:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 17:56 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 17:49 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4049.*
* 17:46 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4049.ulsfo.wmnet with OS trixie
* 17:37 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:37 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:36 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie
* 17:35 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4050.ulsfo.wmnet with OS trixie
* 17:35 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:34 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:27 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 17:26 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 17:26 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 17:26 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 17:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4049.ulsfo.wmnet with reason: host reimage
* 17:17 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie
* 17:17 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:16 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:16 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4049.ulsfo.wmnet with reason: host reimage
* 17:12 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1016.eqiad.wmnet
* 17:12 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1016.eqiad.wmnet
* 17:11 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1016.eqiad.wmnet
* 17:11 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4048.*
* 17:10 dhinus: (re-logging failed SAL entry) conftool action : set/pooled=no; selector: name=clouddb1016.eqiad.wmnet
* 17:10 dhinus: (re-logging failed SAL entry) DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1016.eqiad.wmnet with reason: Rebooting clouddb1016 [[phab:T419960|T419960]]
* 17:09 dhinus: (re-logging failed SAL entry) END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1015.eqiad.wmnet
* 17:08 dhinus: (re-logging failed SAL entry) START - Cookbook sre.hosts.remove-downtime for clouddb1015.eqiad.wmnet
* 17:08 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 17:07 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 17:07 dhinus: fnegri@cumin1003 conftool action : set/pooled=yes; selector: name=clouddb1015.eqiad.wmnet
* 17:07 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 17:07 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4048.ulsfo.wmnet with OS trixie
* 17:06 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 16:40 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4049.ulsfo.wmnet with OS trixie
* 16:39 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4048.ulsfo.wmnet with reason: host reimage
* 16:36 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1015.eqiad.wmnet
* 16:35 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1015.eqiad.wmnet with reason: Rebooting clouddb1015 [[phab:T419960|T419960]]
* 16:34 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1014.eqiad.wmnet
* 16:34 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1014.eqiad.wmnet
* 16:34 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet
* 16:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet
* 16:28 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudweb1003.wikimedia.org
* 16:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet
* 16:22 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudweb1003.wikimedia.org
* 16:21 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudweb1004.wikimedia.org
* 16:20 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1014.eqiad.wmnet with reason: Rebooting clouddb1014 [[phab:T419960|T419960]]
* 16:20 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1014.eqiad.wmnet
* 16:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor-dev2001.codfw.wmnet
* 16:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4048.ulsfo.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4048.ulsfo.wmnet with OS trixie
* 16:16 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudweb1004.wikimedia.org
* 16:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor-dev2001.codfw.wmnet
* 16:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor2003.codfw.wmnet
* 16:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor2003.codfw.wmnet
* 16:00 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:00 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:00 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4048.ulsfo.wmnet with OS trixie
* 15:43 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2010-dev.codfw.wmnet
* 15:38 vgutierrez@cumin1003: END (PASS) - Cookbook sre.loadbalancer.check-ipip (exit_code=0)
* 15:38 vgutierrez@cumin1003: START - Cookbook sre.loadbalancer.check-ipip
* 15:37 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 15:37 vgutierrez@cumin1003: END (FAIL) - Cookbook sre.loadbalancer.check-ipip (exit_code=99)
* 15:37 vgutierrez@cumin1003: START - Cookbook sre.loadbalancer.check-ipip
* 15:36 vgutierrez@cumin1003: END (FAIL) - Cookbook sre.loadbalancer.check-ipip (exit_code=99)
* 15:36 vgutierrez@cumin1003: START - Cookbook sre.loadbalancer.check-ipip
* 15:36 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2010-dev.codfw.wmnet
* 15:35 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2005-dev.codfw.wmnet
* 15:35 vgutierrez@cumin1003: END (FAIL) - Cookbook sre.loadbalancer.check-ipip (exit_code=99)
* 15:35 vgutierrez@cumin1003: START - Cookbook sre.loadbalancer.check-ipip
* 15:28 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2005-dev.codfw.wmnet
* 15:26 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:25 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:23 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:22 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 15:22 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 15:22 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 15:19 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:19 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:19 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:16 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2006-dev.codfw.wmnet
* 15:12 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudidp2001-dev.codfw.wmnet
* 15:08 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudidp2001-dev.codfw.wmnet
* 15:07 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2006-dev.codfw.wmnet
* 14:58 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1013.eqiad.wmnet
* 14:58 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1013.eqiad.wmnet
* 14:57 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet,service=s3
* 14:57 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet,service=s1
* 14:48 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1015.eqiad.wmnet
* 14:46 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1035.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:46 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2005-dev.codfw.wmnet
* 14:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti-test2003.codfw.wmnet
* 14:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2003.codfw.wmnet
* 14:44 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1373.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:43 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1015.eqiad.wmnet
* 14:43 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1034.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:42 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1033.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:40 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1023
* 14:40 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1023
* 14:40 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1022
* 14:40 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1022
* 14:40 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1021
* 14:39 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup2004.codfw.wmnet
* 14:39 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1021
* 14:38 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2005-dev.codfw.wmnet
* 14:37 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1020
* 14:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2003.codfw.wmnet
* 14:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti-test2003.codfw.wmnet
* 14:35 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1020
* 14:35 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Rebooting clouddb1013 [[phab:T419960|T419960]]
* 14:33 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:33 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1035.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:33 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet
* 14:32 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1035.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:32 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1035.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti-test2003.codfw.wmnet
* 14:32 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1034.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti-test2003.codfw.wmnet
* 14:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1033.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1373.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:29 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:29 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt - jclark@cumin1003"
* 14:29 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt - jclark@cumin1003"
* 14:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti-test2002.codfw.wmnet
* 14:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet
* 14:27 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudbackup2004.codfw.wmnet
* 14:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cuminunpriv1001.eqiad.wmnet
* 14:25 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup2003.codfw.wmnet
* 14:25 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 14:25 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 14:24 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 14:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cuminunpriv1001.eqiad.wmnet
* 14:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet
* 14:22 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1004.eqiad.wmnet
* 14:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti-test2002.codfw.wmnet
* 14:14 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudbackup2003.codfw.wmnet
* 14:13 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudbackup1004.eqiad.wmnet
* 14:09 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1003.eqiad.wmnet
* 14:01 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudbackup1003.eqiad.wmnet
* 13:59 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gerrit1003.wikimedia.org
* 13:53 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gerrit1003.wikimedia.org
* 13:49 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists2001.wikimedia.org
* 13:48 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1004.eqiad.wmnet
* 13:46 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host etherpad1004.eqiad.wmnet
* 13:45 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:45 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:44 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1004.eqiad.wmnet
* 13:42 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists2001.wikimedia.org
* 13:42 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host etherpad1004.eqiad.wmnet
* 13:37 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host etherpad2002.codfw.wmnet
* 13:36 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gerrit2002.wikimedia.org
* 13:33 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host etherpad2002.codfw.wmnet
* 13:32 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2003.wikimedia.org
* 13:30 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gerrit2002.wikimedia.org
* 13:26 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab2003.wikimedia.org
* 13:26 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
* 13:24 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2020.codfw.wmnet
* 13:23 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2019.codfw.wmnet
* 13:19 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
* 13:19 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1003.wikimedia.org
* 13:13 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2020.codfw.wmnet
* 13:13 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1003.wikimedia.org
* 13:12 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2019.codfw.wmnet
* 13:11 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.reboot-runner (exit_code=0) rolling reboot on A:gitlab-runner
* 13:05 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2018.codfw.wmnet
* 13:05 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1020.eqiad.wmnet
* 12:54 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2018.codfw.wmnet
* 12:54 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1020.eqiad.wmnet
* 12:54 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2017.codfw.wmnet
* 12:54 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1019.eqiad.wmnet
* 12:53 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:50 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:50 moritzm: powercycle pki1002
* 12:48 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:47 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:44 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:44 mutante: rebooted phab1005 - waiting for it to come back
* 12:44 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:43 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2017.codfw.wmnet
* 12:43 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1019.eqiad.wmnet
* 12:42 bwojtowicz@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:40 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1018.eqiad.wmnet
* 12:39 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2016.codfw.wmnet
* 12:31 jelto@cumin1003: START - Cookbook sre.gitlab.reboot-runner rolling reboot on A:gitlab-runner
* 12:29 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1018.eqiad.wmnet
* 12:29 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1017.eqiad.wmnet
* 12:28 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2016.codfw.wmnet
* 12:27 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2015.codfw.wmnet
* 12:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast1004.wikimedia.org
* 12:18 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host doc1004.eqiad.wmnet
* 12:18 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1017.eqiad.wmnet
* 12:17 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:17 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:15 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2015.codfw.wmnet
* 12:15 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:15 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:14 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host doc1004.eqiad.wmnet
* 12:13 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aphlict2001.codfw.wmnet
* 12:10 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host aphlict2001.codfw.wmnet
* 12:10 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: reboot
* 12:10 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts2002.codfw.wmnet
* 12:07 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:07 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:03 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts2002.codfw.wmnet
* 12:02 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:02 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:01 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1016.eqiad.wmnet
* 12:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1019.eqiad.wmnet
* 11:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1018.eqiad.wmnet
* 11:59 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:59 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1019.eqiad.wmnet
* 11:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1018.eqiad.wmnet
* 11:51 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:51 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:50 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1016.eqiad.wmnet
* 11:49 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-backup2004.codfw.wmnet
* 11:43 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-backup2004.codfw.wmnet
* 11:43 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-backup1004.eqiad.wmnet
* 11:37 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-backup1004.eqiad.wmnet
* 11:36 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-backup2003.codfw.wmnet
* 11:34 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-backup1003.eqiad.wmnet
* 11:32 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 11:32 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:30 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-backup2003.codfw.wmnet
* 11:28 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-backup1003.eqiad.wmnet
* 11:27 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:26 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy1001.eqiad.wmnet
* 11:21 arnaudb@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host contint1003.wikimedia.org
* 11:21 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy1001.eqiad.wmnet
* 11:21 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy1002.eqiad.wmnet
* 11:16 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy1002.eqiad.wmnet
* 11:16 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy2001.codfw.wmnet
* 11:16 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host contint1003.wikimedia.org
* 11:12 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-staging-master-codfw
* 11:12 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul1001.eqiad.wmnet
* 11:11 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy2001.codfw.wmnet
* 11:11 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy2002.codfw.wmnet
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1018.eqiad.wmnet with reason: host reimage
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:09 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-staging-master-eqiad
* 11:08 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:08 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul1001.eqiad.wmnet
* 11:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1008-dev.eqiad.wmnet
* 11:07 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy2002.codfw.wmnet
* 11:06 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy3001.esams.wmnet
* 11:05 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1018.eqiad.wmnet with reason: host reimage
* 11:01 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy3001.esams.wmnet
* 11:01 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1008-dev.eqiad.wmnet
* 11:01 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1002-dev.eqiad.wmnet
* 11:01 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy3002.esams.wmnet
* 10:59 mvernon@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 22:00:00 on db1258.eqiad.wmnet with reason: depooled, likely to flap over the weekend
* 10:57 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudbackup1002-dev.eqiad.wmnet
* 10:57 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1001-dev.eqiad.wmnet
* 10:56 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy3002.esams.wmnet
* 10:56 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-staging-master-codfw
* 10:55 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4001.ulsfo.wmnet
* 10:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-staging-worker-codfw
* 10:54 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudbackup1001-dev.eqiad.wmnet
* 10:52 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-staging-master-eqiad
* 10:50 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-staging-worker-eqiad
* 10:50 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 10:50 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4001.ulsfo.wmnet
* 10:50 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4002.ulsfo.wmnet
* 10:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1019.eqiad.wmnet with reason: host reimage
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1019.eqiad.wmnet with reason: host reimage
* 10:45 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4002.ulsfo.wmnet
* 10:45 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy5001.eqsin.wmnet
* 10:40 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy5001.eqsin.wmnet
* 10:39 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy5002.eqsin.wmnet
* 10:37 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul2001.codfw.wmnet
* 10:37 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool', diff saved to https://phabricator.wikimedia.org/P89852 and previous config saved to /var/cache/conftool/dbconfig/20260313-103719-ladsgroup.json
* 10:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul2001.codfw.wmnet
* 10:32 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy5002.eqsin.wmnet
* 10:31 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudweb2002-dev.wikimedia.org
* 10:31 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul1002.eqiad.wmnet
* 10:31 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy6001.drmrs.wmnet
* 10:29 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:29 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:28 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:28 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:28 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 10:27 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 10:27 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:27 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul1002.eqiad.wmnet
* 10:27 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:26 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy6001.drmrs.wmnet
* 10:24 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudweb2002-dev.wikimedia.org
* 10:23 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy6002.drmrs.wmnet
* 10:22 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul2002.codfw.wmnet
* 10:19 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1008.eqiad.wmnet
* 10:18 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul2002.codfw.wmnet
* 10:18 arnaudb@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host zuul2002.codfw.wmnet
* 10:18 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul2002.codfw.wmnet
* 10:18 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy6002.drmrs.wmnet
* 10:16 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy7002.magru.wmnet
* 10:16 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-staging-worker-eqiad
* 10:15 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-staging-worker-codfw
* 10:13 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1008.eqiad.wmnet
* 10:13 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1007.eqiad.wmnet
* 10:12 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy7002.magru.wmnet
* 10:09 jelto@cumin1003: conftool action : set/pooled=yes; selector: name=tcp-proxy7001.magru.wmnet
* 10:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1007.eqiad.wmnet
* 10:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1006.eqiad.wmnet
* 10:07 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy7001.magru.wmnet
* 10:03 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy7001.magru.wmnet
* 10:02 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1006.eqiad.wmnet
* 10:02 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1005.eqiad.wmnet
* 10:01 jelto@cumin1003: conftool action : set/pooled=no; selector: name=tcp-proxy7001.magru.wmnet
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 09:57 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1005.eqiad.wmnet
* 09:57 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1004.eqiad.wmnet
* 09:51 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1004.eqiad.wmnet
* 09:51 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1003.eqiad.wmnet
* 09:50 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:50 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:46 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1003.eqiad.wmnet
* 09:46 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1002.eqiad.wmnet
* 09:41 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1002.eqiad.wmnet
* 09:40 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1001.eqiad.wmnet
* 09:39 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:39 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:35 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1001.eqiad.wmnet
* 09:35 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-ctrl1002.eqiad.wmnet
* 09:34 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-ctrl1001.eqiad.wmnet
* 09:34 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:33 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:32 moritzm: installing Linux 6.1.164 on Bookworm hosts
* 09:30 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-ctrl1002.eqiad.wmnet
* 09:28 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-ctrl1001.eqiad.wmnet
* 09:01 elukey@cumin1003: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 08:37 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 07:56 moritzm: installing Linux 6.12.74 on Trixie hosts
* 07:55 moritzm: installing 6.12.74 on Trixie hosts
* 02:57 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4044.ulsfo.wmnet [reason: trixie reimaging]
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 18s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:41 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4044.ulsfo.wmnet with OS trixie
* 01:37 mutante: contint1003/contint2003 - every time(?) we set up machines with puppet using our httpd module and PHP, the first puppet run hits the same old issue with "Exec[ensure_present_mod_php" failing and "Considering conflict mpm_worker for mpm_prefork". The fix is: 'sudo a2dismod mpm_event' and run puppet again. [[phab:T418521|T418521]]
* 01:26 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on contint1003.wikimedia.org with reason: [[phab:T418521|T418521]]
* 01:26 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on contint2003.wikimedia.org with reason: [[phab:T418521|T418521]]
* 01:23 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on contint2003.wikimedia.org with reason: setup
* 01:22 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on contint1003.wikimedia.org with reason: setup
* 01:22 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4047.*
* 01:09 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage
* 01:08 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4043.ulsfo.wmnet [reason: trixie reimaging]
* 01:06 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage
* 01:05 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4043.ulsfo.wmnet with OS trixie
* 00:55 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4047.ulsfo.wmnet with OS trixie
* 00:45 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4044.ulsfo.wmnet with OS trixie
* 00:45 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4044.ulsfo.wmnet [reason: trixie reimaging]
* 00:42 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4042.ulsfo.wmnet [reason: trixie reimaging]
* 00:41 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4042.ulsfo.wmnet with OS trixie
* 00:39 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4043.ulsfo.wmnet with reason: host reimage
* 00:31 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4043.ulsfo.wmnet with reason: host reimage
* 00:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4047.ulsfo.wmnet with reason: host reimage
* 00:27 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1251187 [[phab:T419637|T419637]] (duration: 07m 12s)
* 00:23 rzl@deploy2002: rzl: Continuing with sync
* 00:23 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4047.ulsfo.wmnet with reason: host reimage
* 00:22 rzl@deploy2002: rzl: https://gerrit.wikimedia.org/r/1251187 [[phab:T419637|T419637]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:21 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1251187 [[phab:T419637|T419637]]
* 00:15 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage
* 00:14 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4040.ulsfo.wmnet [reason: trixie reimaging]
* 00:11 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage
* 00:11 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4043.ulsfo.wmnet with OS trixie
* 00:10 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4040.ulsfo.wmnet with OS trixie
* 00:04 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4047.ulsfo.wmnet with OS trixie
* 00:03 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4047.ulsfo.wmnet with OS trixie
* 00:03 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4043.ulsfo.wmnet with OS trixie
== 2026-03-12 ==
* 23:57 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host o11ytest1001.eqiad.wmnet with OS trixie
* 23:53 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 23:53 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 23:50 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4042.ulsfo.wmnet with OS trixie
* 23:49 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 23:49 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 23:45 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 23:45 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 23:45 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage
* 23:44 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4042.ulsfo.wmnet with OS trixie
* 23:41 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage
* 23:41 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 23:41 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 23:40 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on o11ytest1001.eqiad.wmnet with reason: host reimage
* 23:36 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 23:36 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on o11ytest1001.eqiad.wmnet with reason: host reimage
* 23:36 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 23:35 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 23:35 rzl@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 23:22 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host o11ytest1001
* 23:22 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host o11ytest1001
* 23:21 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4040.ulsfo.wmnet with OS trixie
* 23:19 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4040.ulsfo.wmnet with OS trixie
* 23:18 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host o11ytest1001
* 23:18 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) o11ytest1001.eqiad.wmnet 141.32.64.10.in-addr.arpa 1.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 23:18 herron@cumin1003: START - Cookbook sre.dns.wipe-cache o11ytest1001.eqiad.wmnet 141.32.64.10.in-addr.arpa 1.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 23:18 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:18 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host o11ytest1001 - herron@cumin1003"
* 23:18 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host o11ytest1001 - herron@cumin1003"
* 23:04 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4047.ulsfo.wmnet with OS trixie
* 23:00 herron@cumin1003: START - Cookbook sre.dns.netbox
* 23:00 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host o11ytest1001
* 22:59 herron@cumin1003: START - Cookbook sre.hosts.reimage for host o11ytest1001.eqiad.wmnet with OS trixie
* 22:58 herron@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mwlog1002 to o11ytest1001
* 22:57 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host o11ytest1001
* 22:55 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host o11ytest1001
* 22:55 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) o11ytest1001 on all recursors
* 22:55 herron@cumin1003: START - Cookbook sre.dns.wipe-cache o11ytest1001 on all recursors
* 22:55 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:55 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mwlog1002 to o11ytest1001 - herron@cumin1003"
* 22:54 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mwlog1002 to o11ytest1001 - herron@cumin1003"
* 22:51 herron@cumin1003: START - Cookbook sre.dns.netbox
* 22:50 herron@cumin1003: START - Cookbook sre.hosts.rename from mwlog1002 to o11ytest1001
* 22:42 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4043.ulsfo.wmnet with OS trixie
* 22:42 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4043.ulsfo.wmnet [reason: trixie reimaging]
* 22:41 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4041.ulsfo.wmnet [reason: trixie reimaging]
* 22:39 bvibber@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251190{{!}}Enable ReaderExperiments Share Highlight subfeature for metrics (T416945)]], [[gerrit:1251195{{!}}Metrics module for share highlight experiment baseline (T416945)]] (duration: 06m 49s)
* 22:38 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4041.ulsfo.wmnet with OS trixie
* 22:35 bvibber@deploy2002: bvibber: Continuing with sync
* 22:34 bvibber@deploy2002: bvibber: Backport for [[gerrit:1251190{{!}}Enable ReaderExperiments Share Highlight subfeature for metrics (T416945)]], [[gerrit:1251195{{!}}Metrics module for share highlight experiment baseline (T416945)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:32 bvibber@deploy2002: Started scap sync-world: Backport for [[gerrit:1251190{{!}}Enable ReaderExperiments Share Highlight subfeature for metrics (T416945)]], [[gerrit:1251195{{!}}Metrics module for share highlight experiment baseline (T416945)]]
* 22:28 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1251182 [[phab:T419637|T419637]] (duration: 11m 18s)
* 22:27 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host o11ytest2001.codfw.wmnet with OS trixie
* 22:26 rzl@deploy2002: rzl: Continuing with sync
* 22:24 rzl@deploy2002: rzl: https://gerrit.wikimedia.org/r/1251182 [[phab:T419637|T419637]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:23 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4042.ulsfo.wmnet with OS trixie
* 22:23 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4042.ulsfo.wmnet [reason: trixie reimaging]
* 22:20 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4046.*
* 22:17 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1251182 [[phab:T419637|T419637]]
* 22:13 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4041.ulsfo.wmnet with reason: host reimage
* 22:09 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on o11ytest2001.codfw.wmnet with reason: host reimage
* 22:08 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4041.ulsfo.wmnet with reason: host reimage
* 22:03 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on o11ytest2001.codfw.wmnet with reason: host reimage
* 22:01 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:58 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4040.ulsfo.wmnet with OS trixie
* 21:58 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4040.ulsfo.wmnet [reason: trixie reimaging]
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host o11ytest2001
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host o11ytest2001
* 21:45 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host o11ytest2001
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) o11ytest2001.codfw.wmnet 9.32.192.10.in-addr.arpa 9.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:45 herron@cumin1003: START - Cookbook sre.dns.wipe-cache o11ytest2001.codfw.wmnet 9.32.192.10.in-addr.arpa 9.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host o11ytest2001 - herron@cumin1003"
* 21:45 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host o11ytest2001 - herron@cumin1003"
* 21:43 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4041.ulsfo.wmnet with OS trixie
* 21:41 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4038.ulsfo.wmnet [reason: trixie reimaging]
* 21:40 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4038.ulsfo.wmnet with OS trixie
* 21:39 herron@cumin1003: START - Cookbook sre.dns.netbox
* 21:39 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host o11ytest2001
* 21:39 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:39 herron@cumin1003: START - Cookbook sre.hosts.reimage for host o11ytest2001.codfw.wmnet with OS trixie
* 21:36 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 21:35 herron@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mwlog2002 to o11ytest2001
* 21:35 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 21:35 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 21:35 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host o11ytest2001
* 21:34 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 21:34 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 21:33 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 21:32 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host o11ytest2001
* 21:32 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) o11ytest2001 on all recursors
* 21:32 herron@cumin1003: START - Cookbook sre.dns.wipe-cache o11ytest2001 on all recursors
* 21:32 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:32 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mwlog2002 to o11ytest2001 - herron@cumin1003"
* 21:31 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mwlog2002 to o11ytest2001 - herron@cumin1003"
* 21:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4046.ulsfo.wmnet with OS trixie
* 21:27 herron@cumin1003: START - Cookbook sre.dns.netbox
* 21:26 herron@cumin1003: START - Cookbook sre.hosts.rename from mwlog2002 to o11ytest2001
* 21:20 rzl: rzl@apt1002:~$ sudo -i reprepro copy trixie-wikimedia bullseye-wikimedia envoyproxy
* 21:20 rzl: rzl@apt1002:~$ sudo -i reprepro copy bookworm-wikimedia bullseye-wikimedia envoyproxy
* 21:20 rzl: rzl@apt1002:~$ sudo -i reprepro -C main includedeb bullseye-wikimedia /srv/wikimedia/pool/component/envoy-future/e/envoyproxy/envoyproxy_1.35.9-1_amd64.deb
* 21:13 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage
* 21:13 cscott@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251173{{!}}Revert "Move post-processing of flaggedrevs views inside FlaggablePageView"]] (duration: 07m 28s)
* 21:09 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage
* 21:09 cscott@deploy2002: cscott: Continuing with sync
* 21:07 cscott@deploy2002: cscott: Backport for [[gerrit:1251173{{!}}Revert "Move post-processing of flaggedrevs views inside FlaggablePageView"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:05 cscott@deploy2002: Started scap sync-world: Backport for [[gerrit:1251173{{!}}Revert "Move post-processing of flaggedrevs views inside FlaggablePageView"]]
* 21:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage
* 21:02 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251152{{!}}Use 'alwaysShowLogin' query parameter during login (T419723)]], [[gerrit:1251150{{!}}login: Add 'alwaysShowLogin' login URL parameter (T419723)]], [[gerrit:1251168{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1250750{{!}}Enable parser survey for opted-out users on ru/pt/ja/id wikis (T414852)]] (duration: 10m 41s)
* 20:58 tgr@deploy2002: tgr, jsn, cscott: Continuing with sync
* 20:58 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage
* 20:54 tgr@deploy2002: tgr, jsn, cscott: Backport for [[gerrit:1251152{{!}}Use 'alwaysShowLogin' query parameter during login (T419723)]], [[gerrit:1251150{{!}}login: Add 'alwaysShowLogin' login URL parameter (T419723)]], [[gerrit:1251168{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1250750{{!}}Enable parser survey for opted-out users on ru/pt/ja/id wikis (T414852)]] synced to the testservers (see https://wikitech
* 20:52 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1251152{{!}}Use 'alwaysShowLogin' query parameter during login (T419723)]], [[gerrit:1251150{{!}}login: Add 'alwaysShowLogin' login URL parameter (T419723)]], [[gerrit:1251168{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1250750{{!}}Enable parser survey for opted-out users on ru/pt/ja/id wikis (T414852)]]
* 20:49 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 20:43 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251087{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251088{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251106{{!}}phpunit: Avoid unnecessary writes in generatePHPUnitConfig.php (T419107)]] (duration: 07m 37s)
* 20:39 tgr@deploy2002: tgr, daimona: Continuing with sync
* 20:37 tgr@deploy2002: tgr, daimona: Backport for [[gerrit:1251087{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251088{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251106{{!}}phpunit: Avoid unnecessary writes in generatePHPUnitConfig.php (T419107)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:37 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4046.ulsfo.wmnet with OS trixie
* 20:35 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1251087{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251088{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251106{{!}}phpunit: Avoid unnecessary writes in generatePHPUnitConfig.php (T419107)]]
* 20:35 jsn@deploy2002: Synchronized wmf-config/throttle.php: (no justification provided) (duration: 01m 57s)
* 20:32 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4045.*
* 20:28 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4041.ulsfo.wmnet with OS trixie
* 20:20 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 20:18 jsn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249364{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1251140{{!}}[arwikiquote] add namespace alias for NS_PROJECT (T419828)]], [[gerrit:1251098{{!}}Deploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1251164{{!}}Increase IP cap limit for azwiki (T419899)]] (duration: 11m 11s)
* 20:14 jsn@deploy2002: jsn, dani, nmw03, gergesshamon: Continuing with sync
* 20:09 jsn@deploy2002: jsn, dani, nmw03, gergesshamon: Backport for [[gerrit:1249364{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1251140{{!}}[arwikiquote] add namespace alias for NS_PROJECT (T419828)]], [[gerrit:1251098{{!}}Deploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1251164{{!}}Increase IP cap limit for azwiki (T419899)]] synced to the testservers (see https://wikitech.wikimedia.org/wik
* 20:07 jsn@deploy2002: Started scap sync-world: Backport for [[gerrit:1249364{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1251140{{!}}[arwikiquote] add namespace alias for NS_PROJECT (T419828)]], [[gerrit:1251098{{!}}Deploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1251164{{!}}Increase IP cap limit for azwiki (T419899)]]
* 19:21 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mathoid: apply
* 19:21 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mathoid: apply
* 19:20 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mathoid: apply
* 19:19 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mathoid: apply
* 19:16 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 19:16 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 19:15 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 19:14 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 19:13 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 19:12 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 19:12 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 19:11 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 19:07 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4041.ulsfo.wmnet with OS trixie
* 19:06 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4041.ulsfo.wmnet [reason: trixie reimaging]
* 19:06 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4039.ulsfo.wmnet [reason: trixie reimaging]
* 19:06 swfrench-wmf: reprepro include xdebug_3.4.4-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include wikidiff2_1.14.1-2+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include php-yaml_2.2.4-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include php-xhprof_2.3.10-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include php-wmerrors_2.0.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include php-uuid_1.3.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 brennen@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251138{{!}}EditPage: Re-add catch block for MWException (T419883)]] (duration: 09m 46s)
* 19:04 swfrench-wmf: reprepro include php-redis_6.2.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:04 swfrench-wmf: reprepro include php-pcov_1.0.12-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:04 swfrench-wmf: reprepro include php-memcached_3.3.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:04 swfrench-wmf: reprepro include php-luasandbox_4.1.2-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:03 swfrench-wmf: reprepro include php-imagick_3.7.0-13+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:03 swfrench-wmf: reprepro include php-excimer_1.2.5-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:01 brennen@deploy2002: somerandomdeveloper, brennen: Continuing with sync
* 18:59 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 18:57 brennen@deploy2002: somerandomdeveloper, brennen: Backport for [[gerrit:1251138{{!}}EditPage: Re-add catch block for MWException (T419883)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:55 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4039.ulsfo.wmnet with OS trixie
* 18:55 brennen@deploy2002: Started scap sync-world: Backport for [[gerrit:1251138{{!}}EditPage: Re-add catch block for MWException (T419883)]]
* 18:52 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mathoid: apply
* 18:52 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mathoid: apply
* 18:42 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp20(2[789]{{!}}3[0-9]{{!}}40).*,service=ats-be
* 18:34 brennen@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.19 refs [[phab:T413810|T413810]]
* 18:29 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4039.ulsfo.wmnet with reason: host reimage
* 18:29 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:28 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Updating dse-k8s-worker1019 - btullis@cumin1003"
* 18:26 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2332.codfw.wmnet
* 18:26 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2332.codfw.wmnet
* 18:25 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Updating dse-k8s-worker1019 - btullis@cumin1003"
* 18:24 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4039.ulsfo.wmnet with reason: host reimage
* 18:23 brennen@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251139{{!}}Ensure that we always run ParserHooks::transformHtml() when using Parsoid (T419830)]] (duration: 14m 46s)
* 18:21 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 18:20 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4038.ulsfo.wmnet with OS trixie
* 18:19 brennen@deploy2002: cscott, brennen: Continuing with sync
* 18:18 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 18:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4045.ulsfo.wmnet with OS trixie
* 18:10 brennen@deploy2002: cscott, brennen: Backport for [[gerrit:1251139{{!}}Ensure that we always run ParserHooks::transformHtml() when using Parsoid (T419830)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:10 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1019.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:08 brennen@deploy2002: Started scap sync-world: Backport for [[gerrit:1251139{{!}}Ensure that we always run ParserHooks::transformHtml() when using Parsoid (T419830)]]
* 18:02 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1019.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:02 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 17:59 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4039.ulsfo.wmnet with OS trixie
* 17:58 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1019
* 17:58 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1019
* 17:56 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4039.ulsfo.wmnet [reason: trixie reimaging]
* 17:55 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp20(3[6-9]{{!}}4[012]).*
* 17:54 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 17:53 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet [reason: trixie reimaging]
* 17:52 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4045.ulsfo.wmnet with reason: host reimage
* 17:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4037.ulsfo.wmnet with OS trixie
* 17:49 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4045.ulsfo.wmnet with reason: host reimage
* 17:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1018.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:33 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1018.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:31 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1018
* 17:31 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1018
* 17:30 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 17:28 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS trixie
* 17:28 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4045.ulsfo.wmnet with OS trixie
* 17:27 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp203[0-5].*
* 17:24 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4037.ulsfo.wmnet with reason: host reimage
* 17:20 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 17:20 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 17:18 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4037.ulsfo.wmnet with reason: host reimage
* 17:17 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-backup1004.eqiad.wmnet with OS trixie
* 17:17 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 17:16 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 17:06 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp202[89].*
* 17:03 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp2027.*
* 16:59 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 16:58 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4038.ulsfo.wmnet [reason: trixie reimaging]
* 16:58 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-backup1004.eqiad.wmnet with reason: host reimage
* 16:58 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:57 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4037.ulsfo.wmnet with OS trixie
* 16:57 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet [reason: trixie reimaging]
* 16:55 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-backup1004.eqiad.wmnet with reason: host reimage
* 16:50 swfrench-wmf: reprepro include php-apcu_5.1.24-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:45 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 16:45 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 16:44 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 16:43 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 16:43 swfrench-wmf: reprepro include dh-php_5.5+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:42 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 16:41 swfrench-wmf: reprepro include php-defaults_94+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:37 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ms-backup1004.eqiad.wmnet with OS trixie
* 16:36 swfrench-wmf: reprepro include php8.3_8.3.30-1+wmf11u2+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:27 dzahn@dns1004: END - running authdns-update
* 16:26 dzahn@dns1004: START - running authdns-update
* 16:25 mutante: switching old status.wikimedia.org page away from rackspace [[phab:T414098|T414098]]
* 16:25 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS trixie
* 16:20 dzahn@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 16:20 dzahn@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 16:19 dzahn@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 16:19 dzahn@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 16:12 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
* 16:11 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
* 16:10 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 16:09 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 16:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 16:09 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 16:08 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 16:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 16:07 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 16:06 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 16:05 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
* 16:04 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
* 16:04 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 16:04 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 16:03 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
* 16:02 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
* 16:02 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 16:02 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 16:01 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 15:58 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 15:57 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 15:57 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 15:56 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 15:52 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudgw2002-dev.codfw.wmnet
* 15:52 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:52 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw2002-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 15:47 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw2002-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 15:43 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 15:41 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:40 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:36 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudgw2002-dev.codfw.wmnet
* 15:35 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 15:33 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 15:27 ebernhardson@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 15:26 ebernhardson@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 15:19 moritzm: reuploaded libxml2 2.9.10+dfsg-6.7+deb11u9+wmf11u1 and 72.1-3+deb12u1~wmf11u1 to component/php83-icu72 for bullseye-wikimedia [[phab:T419058|T419058]]
* 15:14 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:13 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:13 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=tcp-proxy4004.ulsfo.wmnet
* 15:13 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=tcp-proxy4004.ulsfo.wmnet
* 15:12 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=tcp-proxy4003.ulsfo.wmnet
* 15:12 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=tcp-proxy4003.ulsfo.wmnet
* 15:00 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 15:00 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:57 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:56 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:45 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:44 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1018.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:34 andrew@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 14:33 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1018.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:31 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 14:31 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1018
* 14:31 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1018
* 14:25 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:24 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:20 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1020.eqiad.wmnet
* 14:15 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 24 hosts with reason: Switch BGP bounce
* 14:12 ayounsi@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet
* 14:09 mlitn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251034{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251035{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251036{{!}}Remove queueing logic (T419587)]], [[gerrit:1251037{{!}}Remove queueing logic (T419587)]] (duration: 07m 15s)
* 14:08 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 14:07 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:07 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:05 mlitn@deploy2002: mlitn: Continuing with sync
* 14:04 mlitn@deploy2002: mlitn: Backport for [[gerrit:1251034{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251035{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251036{{!}}Remove queueing logic (T419587)]], [[gerrit:1251037{{!}}Remove queueing logic (T419587)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:03 XioNoX: start eqiad rack D2 depools
* 14:02 mlitn@deploy2002: Started scap sync-world: Backport for [[gerrit:1251034{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251035{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251036{{!}}Remove queueing logic (T419587)]], [[gerrit:1251037{{!}}Remove queueing logic (T419587)]]
* 13:59 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:59 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:57 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:54 moritzm: installing libssh security updates
* 13:54 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:45 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251031{{!}}ext.testKitchen: Depend on mediawiki.user module]], [[gerrit:1251048{{!}}Add title to the request context in FlaggedRevsCacheTest (T419539)]], [[gerrit:1251032{{!}}ext.testKitchen: Depend on mediawiki.user module]] (duration: 08m 01s)
* 13:42 phuedx@deploy2002: phuedx: Continuing with sync
* 13:39 phuedx@deploy2002: phuedx: Backport for [[gerrit:1251031{{!}}ext.testKitchen: Depend on mediawiki.user module]], [[gerrit:1251048{{!}}Add title to the request context in FlaggedRevsCacheTest (T419539)]], [[gerrit:1251032{{!}}ext.testKitchen: Depend on mediawiki.user module]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:37 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1251031{{!}}ext.testKitchen: Depend on mediawiki.user module]], [[gerrit:1251048{{!}}Add title to the request context in FlaggedRevsCacheTest (T419539)]], [[gerrit:1251032{{!}}ext.testKitchen: Depend on mediawiki.user module]]
* 13:26 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251005{{!}}Deploy EditCheck suggestion mode at all Wikipedias (T415320)]] (duration: 06m 42s)
* 13:24 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 esanders@deploy2002: esanders: Continuing with sync
* 13:22 esanders@deploy2002: esanders: Backport for [[gerrit:1251005{{!}}Deploy EditCheck suggestion mode at all Wikipedias (T415320)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1178.eqiad.wmnet
* 13:21 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 13:20 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1251005{{!}}Deploy EditCheck suggestion mode at all Wikipedias (T415320)]]
* 13:18 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250656{{!}}Add multilingual revert risk host header for LiftWing requests (T419718)]] (duration: 10m 52s)
* 13:14 fnegri@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.add-wiki (exit_code=99) for database kaiwiki ([[phab:T414240|T414240]])
* 13:14 fnegri@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database kaiwiki ([[phab:T414240|T414240]])
* 13:14 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 13:14 kgraessle@deploy2002: kgraessle: Continuing with sync
* 13:12 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 13:11 kgraessle@deploy2002: kgraessle: Backport for [[gerrit:1250656{{!}}Add multilingual revert risk host header for LiftWing requests (T419718)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 13:07 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1250656{{!}}Add multilingual revert risk host header for LiftWing requests (T419718)]]
* 13:05 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1159.eqiad.wmnet
* 13:03 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 13:02 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 13:02 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:02 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:57 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1159.eqiad.wmnet
* 12:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1013.eqiad.wmnet
* 12:49 dpogorzelski@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop: sync
* 12:49 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1013.eqiad.wmnet
* 12:49 dpogorzelski@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: sync
* 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:31 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host durum4004.ulsfo.wmnet
* 12:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4004.ulsfo.wmnet
* 12:28 moritzm: installing postgresql-17 security updates
* 12:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4004.ulsfo.wmnet
* 12:14 moritzm: installing wireshark security updates
* 12:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1013.eqiad.wmnet with reason: host reimage
* 12:07 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1013.eqiad.wmnet with reason: host reimage
* 11:52 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:51 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:50 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host tcp-proxy4004.ulsfo.wmnet
* 11:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy4004.ulsfo.wmnet with OS trixie
* 11:49 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy4004.ulsfo.wmnet with reason: host reimage
* 11:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy4004.ulsfo.wmnet with reason: host reimage
* 11:19 jayme: disabled puppet on all wikikube worker nodes to rollout/test new apparmor profiles in staging - [[phab:T419781|T419781]]
* 11:07 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host tcp-proxy4004.ulsfo.wmnet with OS trixie
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy4004.ulsfo.wmnet - jmm@cumin2002"
* 11:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy4004.ulsfo.wmnet - jmm@cumin2002"
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) tcp-proxy4004.ulsfo.wmnet on all recursors
* 11:06 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache tcp-proxy4004.ulsfo.wmnet on all recursors
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy4004.ulsfo.wmnet - jmm@cumin2002"
* 11:03 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:00 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy4004.ulsfo.wmnet - jmm@cumin2002"
* 10:42 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device asw1-23-ulsfo
* 10:41 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device asw1-23-ulsfo
* 10:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1013.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 10:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4003.ulsfo.wmnet
* 10:31 vgutierrez@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir4001.ulsfo.wmnet
* 10:31 vgutierrez@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir4002.ulsfo.wmnet
* 10:31 vgutierrez@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir4004.ulsfo.wmnet
* 10:30 vgutierrez@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir4003.ulsfo.wmnet
* 10:30 vgutierrez: repooling ncredir4003 & ncredir4004
* 10:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4003.ulsfo.wmnet
* 10:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host tcp-proxy4004.ulsfo.wmnet
* 10:26 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1013.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 10:26 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:25 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1013
* 10:22 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1013
* 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host tcp-proxy4003.ulsfo.wmnet
* 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy4003.ulsfo.wmnet with OS trixie
* 10:12 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1011.eqiad.wmnet
* 10:12 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:11 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:11 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:10 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:09 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1011.eqiad.wmnet
* 10:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1010.eqiad.wmnet
* 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy4003.ulsfo.wmnet with reason: host reimage
* 10:00 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1010.eqiad.wmnet
* 09:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy4003.ulsfo.wmnet with reason: host reimage
* 09:48 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/SERVICE_NAME: apply
* 09:48 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/SERVICE_NAME: apply
* 09:40 mvernon@cumin2002: conftool action : set/pooled=yes; selector: name=ms-fe2024.codfw.wmnet
* 09:40 mvernon@cumin2002: conftool action : set/pooled=yes; selector: name=ms-fe2023.codfw.wmnet
* 09:40 mvernon@cumin2002: conftool action : set/pooled=yes; selector: name=ms-fe2022.codfw.wmnet
* 09:40 mvernon@cumin2002: conftool action : set/pooled=yes; selector: name=ms-fe2021.codfw.wmnet
* 09:39 mvernon@cumin2002: conftool action : set/weight=40; selector: name=ms-fe2024.codfw.wmnet
* 09:39 mvernon@cumin2002: conftool action : set/weight=40; selector: name=ms-fe2023.codfw.wmnet
* 09:39 mvernon@cumin2002: conftool action : set/weight=40; selector: name=ms-fe2022.codfw.wmnet
* 09:39 mvernon@cumin2002: conftool action : set/weight=40; selector: name=ms-fe2021.codfw.wmnet
* 09:39 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Post reimage - btullis@cumin1003"
* 09:39 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Post reimage - btullis@cumin1003"
* 09:39 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 09:39 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 09:38 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 09:38 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 09:35 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on P<nowiki>{</nowiki>ms-fe[2009-2020].codfw.wmnet<nowiki>}</nowiki> and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 09:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host durum4004.ulsfo.wmnet
* 09:32 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 09:32 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host tcp-proxy4003.ulsfo.wmnet with OS trixie
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy4003.ulsfo.wmnet - jmm@cumin2002"
* 09:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy4003.ulsfo.wmnet - jmm@cumin2002"
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) tcp-proxy4003.ulsfo.wmnet on all recursors
* 09:30 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache tcp-proxy4003.ulsfo.wmnet on all recursors
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy4003.ulsfo.wmnet - jmm@cumin2002"
* 09:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy4003.ulsfo.wmnet - jmm@cumin2002"
* 09:28 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on P<nowiki>{</nowiki>ms-fe[2009-2020].codfw.wmnet<nowiki>}</nowiki> and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 09:28 Emperor: roll-restart codfw ms frontends prior to pooling new ones [[phab:T416243|T416243]]
* 09:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host durum4003.ulsfo.wmnet
* 09:23 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:23 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host tcp-proxy4003.ulsfo.wmnet
* 09:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host durum4003.ulsfo.wmnet
* 09:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts netflow4002.ulsfo.wmnet
* 09:01 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:01 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: netflow4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:56 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: netflow4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:51 slyngshede@dns1004: END - running authdns-update
* 08:50 slyngshede@dns1004: START - running authdns-update
* 08:37 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:32 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts netflow4002.ulsfo.wmnet
* 08:25 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 08:23 arnaudb@dns1004: END - running authdns-update
* 08:21 arnaudb@dns1004: START - running authdns-update
* 07:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir4004.ulsfo.wmnet
* 07:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncredir4004.ulsfo.wmnet
* 07:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir4003.ulsfo.wmnet
* 07:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncredir4003.ulsfo.wmnet
* 05:24 kart_: staging: machinetranslation: Optimize model loading and memory footprints ([[phab:T411058|T411058]])
* 05:19 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
* 05:16 kartik@deploy2002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
* 02:16 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2005.codfw.wmnet with OS trixie
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 14s)
* 02:03 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:59 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2005.codfw.wmnet with reason: host reimage
* 01:52 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2005.codfw.wmnet with reason: host reimage
* 01:49 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:47 swfrench-wmf: reprepro include php-apcu_5.1.24-1+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:37 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2005.codfw.wmnet with OS trixie
* 01:36 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2004.codfw.wmnet with OS trixie
* 01:24 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7012.*
* 01:20 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 01:18 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2004.codfw.wmnet with reason: host reimage
* 01:18 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 01:15 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2004.codfw.wmnet with reason: host reimage
* 01:13 swfrench-wmf: reprepro include dh-php_5.5+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:08 swfrench-wmf: reprepro include php-defaults_94+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:05 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 01:05 rzl@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 01:03 swfrench-wmf: reprepro include php8.3_8.3.30-1+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:00 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2004.codfw.wmnet with OS trixie
* 00:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7012.magru.wmnet with OS trixie
* 00:59 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - brett@cumin2002"
* 00:58 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - brett@cumin2002"
* 00:38 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 00:38 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 00:37 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 00:37 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 00:36 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 00:36 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 00:33 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 00:29 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7012.magru.wmnet with reason: host reimage
* 00:27 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 00:24 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7012.magru.wmnet with reason: host reimage
* 00:03 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7012.magru.wmnet with OS trixie
== 2026-03-11 ==
* 23:56 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7009.*
* 22:52 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:52 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:45 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 22:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 22:29 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 22:29 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 22:27 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7009.magru.wmnet with OS trixie
* 21:56 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 21:55 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 21:54 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250051{{!}}OrchestratorRequest: Switch evaluations to v2 endpoint (T413727)]] (duration: 18m 19s)
* 21:47 jforrester@deploy2002: jforrester: Continuing with sync
* 21:43 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:43 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7009.magru.wmnet with reason: host reimage
* 21:42 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:40 jforrester@deploy2002: jforrester: Backport for [[gerrit:1250051{{!}}OrchestratorRequest: Switch evaluations to v2 endpoint (T413727)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:39 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7009.magru.wmnet with reason: host reimage
* 21:35 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1250051{{!}}OrchestratorRequest: Switch evaluations to v2 endpoint (T413727)]]
* 21:30 rzl: rzl@apt1002:~$ sudo -i reprepro -C component/envoy-future include bullseye-wikimedia /home/rzl/envoyproxy_1.35.9-1_amd64.changes
* 21:29 arlolra@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250665{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]], [[gerrit:1250666{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]] (duration: 35m 16s)
* 21:16 arlolra@deploy2002: arlolra: Continuing with sync
* 21:15 arlolra@deploy2002: arlolra: Backport for [[gerrit:1250665{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]], [[gerrit:1250666{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:08 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7009.magru.wmnet with OS trixie
* 21:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7010.*
* 21:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7010.magru.wmnet with OS trixie
* 20:54 arlolra@deploy2002: Started scap sync-world: Backport for [[gerrit:1250665{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]], [[gerrit:1250666{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]]
* 20:47 jsn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250579{{!}}urwikisource: add logo, sitename and projectnamespace (T415974)]] (duration: 06m 55s)
* 20:43 jsn@deploy2002: anzx, jsn: Continuing with sync
* 20:42 jsn@deploy2002: anzx, jsn: Backport for [[gerrit:1250579{{!}}urwikisource: add logo, sitename and projectnamespace (T415974)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:40 jsn@deploy2002: Started scap sync-world: Backport for [[gerrit:1250579{{!}}urwikisource: add logo, sitename and projectnamespace (T415974)]]
* 20:38 jsn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250581{{!}}riskyArticleEdits: show page descriptions (T419442)]], [[gerrit:1250582{{!}}Fix Instrumentation on mobile view (T419517)]], [[gerrit:1250632{{!}}ext.wikimediaEvents: Updated Test Kitchen impact test experiment (T407570)]] (duration: 10m 37s)
* 20:38 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on ml-serve1014.eqiad.wmnet with reason: [[phab:T400626|T400626]]
* 20:37 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7010.magru.wmnet with reason: host reimage
* 20:34 jsn@deploy2002: jsn, sfaci: Continuing with sync
* 20:34 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search-test: apply
* 20:33 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search-test: apply
* 20:32 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7010.magru.wmnet with reason: host reimage
* 20:30 jsn@deploy2002: jsn, sfaci: Backport for [[gerrit:1250581{{!}}riskyArticleEdits: show page descriptions (T419442)]], [[gerrit:1250582{{!}}Fix Instrumentation on mobile view (T419517)]], [[gerrit:1250632{{!}}ext.wikimediaEvents: Updated Test Kitchen impact test experiment (T407570)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:28 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on gitlab1003.wikimedia.org with reason: Upgrade
* 20:28 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on gitlab2002.wikimedia.org with reason: Upgrade
* 20:27 jsn@deploy2002: Started scap sync-world: Backport for [[gerrit:1250581{{!}}riskyArticleEdits: show page descriptions (T419442)]], [[gerrit:1250582{{!}}Fix Instrumentation on mobile view (T419517)]], [[gerrit:1250632{{!}}ext.wikimediaEvents: Updated Test Kitchen impact test experiment (T407570)]]
* 20:21 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:18 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 20:17 bvibber@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250647{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]], [[gerrit:1250648{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]] (duration: 06m 47s)
* 20:13 bvibber@deploy2002: bvibber: Continuing with sync
* 20:12 bvibber@deploy2002: bvibber: Backport for [[gerrit:1250647{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]], [[gerrit:1250648{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:10 bvibber@deploy2002: Started scap sync-world: Backport for [[gerrit:1250647{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]], [[gerrit:1250648{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]]
* 19:59 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7010.magru.wmnet with OS trixie
* 19:54 andrew@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 19:51 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 19:37 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-backup1004.eqiad.wmnet with OS trixie
* 19:01 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts cp7011.magru.wmnet
* 19:01 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7011.magru.wmnet
* 18:56 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough
* 18:49 brennen@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.19 refs [[phab:T413810|T413810]]
* 18:49 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:49 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:45 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:45 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:44 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:44 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:43 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling restart_daemons on A:wikidough
* 18:42 brennen: 1.46.0-wmf.19 train status: no current blockers, going ahead to group1.
* 18:39 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2332.codfw.wmnet
* 18:37 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2332.codfw.wmnet
* 18:20 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7011.*
* 18:18 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 18:16 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ms-backup1004.eqiad.wmnet with OS trixie
* 18:13 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 17:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 17:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1010.eqiad.wmnet with reason: host reimage
* 17:52 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:52 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:48 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 17:47 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1010.eqiad.wmnet with reason: host reimage
* 17:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 17:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 17:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:38 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:38 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:37 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:37 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:36 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:36 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:36 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:35 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 17:34 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 17:32 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
* 17:31 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
* 17:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:28 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:28 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:27 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:20 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:19 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 17:19 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:18 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:15 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 17:13 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:12 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:09 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:09 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:02 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7011.magru.wmnet with OS trixie
* 17:01 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 17:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 16:58 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on durum4004.ulsfo.wmnet with reason: in setup
* 16:58 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on durum4003.ulsfo.wmnet with reason: in setup
* 16:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 16:40 root@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:40 root@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moving many things from cloudgw2002-dev to cloudgw2004-dev - root@cumin2002"
* 16:40 root@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moving many things from cloudgw2002-dev to cloudgw2004-dev - root@cumin2002"
* 16:39 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 16:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 16:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7011.magru.wmnet with reason: host reimage
* 16:35 root@cumin2002: START - Cookbook sre.dns.netbox
* 16:32 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus4002.ulsfo.wmnet
* 16:32 tappof@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:32 tappof@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - tappof@cumin1003"
* 16:30 tappof@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - tappof@cumin1003"
* 16:30 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7011.magru.wmnet with reason: host reimage
* 16:25 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 16:23 tappof@cumin1003: START - Cookbook sre.dns.netbox
* 16:18 tappof@cumin1003: START - Cookbook sre.hosts.decommission for hosts prometheus4002.ulsfo.wmnet
* 15:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:52 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7011.magru.wmnet with OS trixie
* 15:51 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 15:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 15:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 15:50 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:49 urbanecm@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:48 sukhe: sudo cumin -b1 -s10 "C:dnsrecursor" "run-puppet-agent --enable 'merging CR 1250576'"
* 15:48 urbanecm@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:46 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:45 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 15:43 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 15:39 sukhe: sudo cumin "C:dnsrecursor" "disable-puppet 'merging CR 1250576'"
* 15:35 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 15:26 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 15:08 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 15:08 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 15:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:53 swfrench-wmf: updated component/php83-icu72 with libpcre2 10.42-1~wmf11+1 from apt-staging - [[phab:T419058|T419058]]
* 14:46 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 14:45 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 14:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum4004.ulsfo.wmnet
* 14:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum4004.ulsfo.wmnet with OS trixie
* 14:39 vgutierrez: depool ncredir4003 && ncredir4004
* 14:38 vgutierrez: repool ncredir4001 && ncredir4002
* 14:31 jmm@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir4002.ulsfo.wmnet
* 14:31 jmm@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir4001.ulsfo.wmnet
* 14:30 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir4004.ulsfo.wmnet
* 14:30 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=ncredir4004.ulsfo.wmnet
* 14:27 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir4003.ulsfo.wmnet
* 14:27 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=ncredir4003.ulsfo.wmnet
* 14:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum4004.ulsfo.wmnet with reason: host reimage
* 14:23 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:23 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:19 moritzm: installing python-urllib3 security updates
* 14:18 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum4004.ulsfo.wmnet with reason: host reimage
* 14:14 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 14:13 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:12 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:12 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:12 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:12 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:11 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:11 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:11 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:08 gkyziridis@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:08 gkyziridis@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:07 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250568{{!}}Fix pinnableElement export (T419620)]] (duration: 06m 26s)
* 14:06 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:04 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:04 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:03 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:03 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 14:03 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:02 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1250568{{!}}Fix pinnableElement export (T419620)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:00 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1250568{{!}}Fix pinnableElement export (T419620)]]
* 13:58 moritzm: uploaded libxml2 2.9.10+dfsg-6.7+deb11u9+wmf11u1 to component/php83-icu72 for bullseye-wikimedia (special build of libxml with ICU disabled to ensure co-installability between icu 67 and icu 72) [[phab:T419058|T419058]]
* 13:57 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250566{{!}}Restore advanced main menu for AMC (T413912)]] (duration: 10m 44s)
* 13:55 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum4004.ulsfo.wmnet with OS trixie
* 13:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:54 vgutierrez: repool cp7016
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4004.ulsfo.wmnet - jmm@cumin2002"
* 13:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4004.ulsfo.wmnet - jmm@cumin2002"
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum4004.ulsfo.wmnet on all recursors
* 13:54 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum4004.ulsfo.wmnet on all recursors
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4004.ulsfo.wmnet - jmm@cumin2002"
* 13:51 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 13:50 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1250566{{!}}Restore advanced main menu for AMC (T413912)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:49 vgutierrez: depool cp7016
* 13:49 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4004.ulsfo.wmnet - jmm@cumin2002"
* 13:46 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1250566{{!}}Restore advanced main menu for AMC (T413912)]]
* 13:45 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:44 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:44 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247547{{!}}Remove `MetricsPlatform` configuration from production (T416865)]] (duration: 35m 52s)
* 13:43 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 13:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 13:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4004.ulsfo.wmnet with OS bookworm
* 13:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:36 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum4004.ulsfo.wmnet
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum4003.ulsfo.wmnet
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum4003.ulsfo.wmnet with OS trixie
* 13:36 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 13:35 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 13:30 jdlrobson@deploy2002: jdlrobson, sfaci: Continuing with sync
* 13:29 jdlrobson@deploy2002: jdlrobson, sfaci: Backport for [[gerrit:1247547{{!}}Remove `MetricsPlatform` configuration from production (T416865)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
* 13:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum4003.ulsfo.wmnet with reason: host reimage
* 13:18 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
* 13:13 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum4003.ulsfo.wmnet with reason: host reimage
* 13:08 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1247547{{!}}Remove `MetricsPlatform` configuration from production (T416865)]]
* 13:00 moritzm: installing libcommons-lang3-java security updates
* 12:57 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4004.ulsfo.wmnet with OS bookworm
* 12:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4003.ulsfo.wmnet with OS bookworm
* 12:46 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum4003.ulsfo.wmnet with OS trixie
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4003.ulsfo.wmnet - jmm@cumin2002"
* 12:46 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4003.ulsfo.wmnet - jmm@cumin2002"
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum4003.ulsfo.wmnet on all recursors
* 12:45 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum4003.ulsfo.wmnet on all recursors
* 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4003.ulsfo.wmnet - jmm@cumin2002"
* 12:41 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4003.ulsfo.wmnet - jmm@cumin2002"
* 12:37 moritzm: installing inetutils security updates
* 12:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:36 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum4003.ulsfo.wmnet
* 12:35 tappof: completed migration from prometheus4002 to prometheus4003 (ulsfo) (T419430)
* 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
* 12:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
* 12:28 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 12:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2073.codfw.wmnet with OS bullseye
* 12:23 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 12:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow4003.ulsfo.wmnet
* 12:18 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 12:17 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1011
* 12:17 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1011
* 12:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2072.codfw.wmnet with OS bullseye
* 12:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow4003.ulsfo.wmnet
* 12:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2073.codfw.wmnet with reason: host reimage
* 12:04 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4003.ulsfo.wmnet with OS bookworm
* 12:01 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1010
* 11:59 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1010
* 11:58 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2073.codfw.wmnet with reason: host reimage
* 11:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2072.codfw.wmnet with reason: host reimage
* 11:48 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2072.codfw.wmnet with reason: host reimage
* 11:41 urbanecm@deploy2002: Finished scap sync-world: Backport for [[gerrit:1239954{{!}}[Growth] Enable on every new Wikipedia by default (T304052)]] (duration: 06m 39s)
* 11:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2073
* 11:38 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2073
* 11:37 vgutierrez: upgrading to acme-chief 0.39 on acme-chief production instances - [[phab:T419352|T419352]]
* 11:37 urbanecm@deploy2002: urbanecm: Continuing with sync
* 11:36 urbanecm@deploy2002: urbanecm: Backport for [[gerrit:1239954{{!}}[Growth] Enable on every new Wikipedia by default (T304052)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:36 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2073
* 11:36 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2073.codfw.wmnet 212.48.192.10.in-addr.arpa 2.1.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:36 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2073.codfw.wmnet 212.48.192.10.in-addr.arpa 2.1.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:36 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:36 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2073 - mvernon@cumin2002"
* 11:36 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2073 - mvernon@cumin2002"
* 11:35 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 11:34 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 11:34 urbanecm@deploy2002: Started scap sync-world: Backport for [[gerrit:1239954{{!}}[Growth] Enable on every new Wikipedia by default (T304052)]]
* 11:34 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 11:34 urbanecm@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250539{{!}}[Growth] kaiwiki: Enable GrowthExperiments (T304052)]] (duration: 14m 11s)
* 11:33 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 11:33 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 11:32 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 11:32 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:31 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2073
* 11:30 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2073.codfw.wmnet with OS bullseye
* 11:30 urbanecm@deploy2002: urbanecm: Continuing with sync
* 11:29 cgoubert@dns1004: END - running authdns-update
* 11:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2072
* 11:29 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2072
* 11:28 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2072
* 11:28 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2072.codfw.wmnet 158.32.192.10.in-addr.arpa 8.5.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:28 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2072.codfw.wmnet 158.32.192.10.in-addr.arpa 8.5.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:28 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:28 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2072 - mvernon@cumin2002"
* 11:28 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2072 - mvernon@cumin2002"
* 11:28 cgoubert@dns1004: START - running authdns-update
* 11:26 urbanecm@deploy2002: mwscript-k8s job started: WikimediaMaintenance:createExtensionTables.php --wiki=kaiwiki growthexperiments # [[phab:T304052|T304052]]
* 11:24 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:24 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2072
* 11:23 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2072.codfw.wmnet with OS bullseye
* 11:22 tappof@dns1004: END - running authdns-update
* 11:22 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
* 11:21 urbanecm@deploy2002: urbanecm: Backport for [[gerrit:1250539{{!}}[Growth] kaiwiki: Enable GrowthExperiments (T304052)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:21 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
* 11:21 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
* 11:21 tappof@dns1004: START - running authdns-update
* 11:21 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
* 11:19 urbanecm@deploy2002: Started scap sync-world: Backport for [[gerrit:1250539{{!}}[Growth] kaiwiki: Enable GrowthExperiments (T304052)]]
* 11:19 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 11:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2071.codfw.wmnet with OS bullseye
* 11:18 urbanecm@deploy2002: mwscript-k8s job started: WikimediaMaintenance:createExtensionTables.php --wiki=kaiwiki growthexperiments # [[phab:T304052|T304052]]
* 11:10 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
* 11:10 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
* 11:08 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
* 11:08 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
* 11:05 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
* 11:05 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
* 10:58 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2071.codfw.wmnet with reason: host reimage
* 10:54 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2071.codfw.wmnet with reason: host reimage
* 10:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2071
* 10:35 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2071
* 10:34 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2071
* 10:34 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2071.codfw.wmnet 221.16.192.10.in-addr.arpa 1.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:34 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2071.codfw.wmnet 221.16.192.10.in-addr.arpa 1.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:34 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:34 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2071 - mvernon@cumin2002"
* 10:34 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2071 - mvernon@cumin2002"
* 10:26 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 10:24 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2071
* 10:23 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2071.codfw.wmnet with OS bullseye
* 10:08 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2095.codfw.wmnet with OS bullseye
* 10:03 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Failed step after ml-serve1015's reimage - elukey@cumin1003"
* 10:02 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Failed step after ml-serve1015's reimage - elukey@cumin1003"
* 10:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1015.eqiad.wmnet with OS trixie
* 10:01 elukey@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 09:59 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2096.codfw.wmnet with OS bullseye
* 09:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2096.codfw.wmnet with OS bullseye
* 09:52 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
* 09:51 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
* 09:51 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2095.codfw.wmnet with OS bullseye
* 09:51 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
* 09:46 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
* 09:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
* 09:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
* 09:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 09:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 09:34 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 09:33 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
* 09:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
* 09:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 09:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
* 09:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 09:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
* 09:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
* 09:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 09:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 09:28 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
* 09:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 09:27 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
* 09:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 09:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 09:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 09:24 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
* 09:22 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ncredir4004.ulsfo.wmnet
* 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4004.ulsfo.wmnet with OS bookworm
* 09:15 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:15 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:14 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 09:10 javiermonton@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249217{{!}}stream: mediawiki.page_html_content_change (T419258)]] (duration: 08m 28s)
* 09:07 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2096.codfw.wmnet with OS bullseye
* 09:06 javiermonton@deploy2002: javiermonton: Continuing with sync
* 09:03 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2095.codfw.wmnet with OS bullseye
* 09:03 javiermonton@deploy2002: javiermonton: Backport for [[gerrit:1249217{{!}}stream: mediawiki.page_html_content_change (T419258)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
* 09:01 javiermonton@deploy2002: Started scap sync-world: Backport for [[gerrit:1249217{{!}}stream: mediawiki.page_html_content_change (T419258)]]
* 08:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1015.eqiad.wmnet with reason: host reimage
* 08:58 trueg@deploy2002: helmfile [staging] DONE helmfile.d/services/SERVICE_NAME: apply
* 08:58 trueg@deploy2002: helmfile [staging] START helmfile.d/services/SERVICE_NAME: apply
* 08:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
* 08:55 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2239.codfw.wmnet with reason: mysql upgrade / restart
* 08:54 moritzm: installing imagemagick security updates
* 08:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1015.eqiad.wmnet with reason: host reimage
* 08:41 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1015.eqiad.wmnet with OS trixie
* 08:40 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1014.eqiad.wmnet with OS trixie
* 08:40 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:39 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:35 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4004.ulsfo.wmnet with OS bookworm
* 08:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
* 08:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir4004.ulsfo.wmnet on all recursors
* 08:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir4004.ulsfo.wmnet on all recursors
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
* 08:25 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
* 08:23 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1014.eqiad.wmnet with reason: host reimage
* 08:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 08:21 Msz2001: UTC morning backport window finished
* 08:21 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 08:21 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir4004.ulsfo.wmnet
* 08:21 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250426{{!}}Drop underscore from titles in wgOATH2FARequiredGroupRemovalPages]] (duration: 10m 46s)
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ncredir4003.ulsfo.wmnet
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4003.ulsfo.wmnet with OS bookworm
* 08:17 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1014.eqiad.wmnet with reason: host reimage
* 08:15 mszwarc@deploy2002: mszwarc: Continuing with sync
* 08:14 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1250426{{!}}Drop underscore from titles in wgOATH2FARequiredGroupRemovalPages]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:10 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1250426{{!}}Drop underscore from titles in wgOATH2FARequiredGroupRemovalPages]]
* 08:09 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249921{{!}}Display list of 2FA-req. groups on AccountSecurity for 2FA-less users (T419422)]], [[gerrit:1250066{{!}}Send2FAWarningNotifications: Support reading users from file (T419111)]] (duration: 33m 07s)
* 08:05 moritzm: installing mariadb bugfix updates from Bookworm point release (tools and libraries as packaged in Debian, unrelated to the wmf-mariadb packages)
* 08:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1014.eqiad.wmnet with OS trixie
* 08:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
* 07:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
* 07:57 mszwarc@deploy2002: mszwarc: Continuing with sync
* 07:56 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1249921{{!}}Display list of 2FA-req. groups on AccountSecurity for 2FA-less users (T419422)]], [[gerrit:1250066{{!}}Send2FAWarningNotifications: Support reading users from file (T419111)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:44 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti1049.eqiad.wmnet
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
* 07:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
* 07:38 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4003.ulsfo.wmnet with OS bookworm
* 07:36 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1249921{{!}}Display list of 2FA-req. groups on AccountSecurity for 2FA-less users (T419422)]], [[gerrit:1250066{{!}}Send2FAWarningNotifications: Support reading users from file (T419111)]]
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
* 07:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir4003.ulsfo.wmnet on all recursors
* 07:34 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir4003.ulsfo.wmnet on all recursors
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
* 07:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
* 07:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir4003.ulsfo.wmnet
* 07:22 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247639{{!}}Enable rr-ml AutoModerator CC Set AutoModeratorMultiLingualRevertRisk with available wikis (T400727)]] (duration: 12m 24s)
* 07:18 kgraessle@deploy2002: kgraessle: Continuing with sync
* 07:12 kgraessle@deploy2002: kgraessle: Backport for [[gerrit:1247639{{!}}Enable rr-ml AutoModerator CC Set AutoModeratorMultiLingualRevertRisk with available wikis (T400727)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:09 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1247639{{!}}Enable rr-ml AutoModerator CC Set AutoModeratorMultiLingualRevertRisk with available wikis (T400727)]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 59s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 00:33 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250117{{!}}Stop setting $wgImageLinksSchemaMigrationStage (T299953)]] (duration: 09m 38s)
* 00:29 zabe@deploy2002: zabe: Continuing with sync
* 00:26 zabe@deploy2002: zabe: Backport for [[gerrit:1250117{{!}}Stop setting $wgImageLinksSchemaMigrationStage (T299953)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:24 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1250117{{!}}Stop setting $wgImageLinksSchemaMigrationStage (T299953)]]
* 00:03 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
* 00:03 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint1003.wikimedia.org with OS trixie
* 00:03 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:03 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
== 2026-03-10 ==
* 23:58 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
* 23:53 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
* 23:49 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
* 23:44 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint1003.wikimedia.org with reason: host reimage
* 23:40 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on contint1003.wikimedia.org with reason: host reimage
* 23:31 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2096.codfw.wmnet with OS bullseye
* 23:31 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2095.codfw.wmnet with OS bullseye
* 23:26 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2095.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:24 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2096.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:22 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 23:22 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:12 vriley@cumin1003: START - Cookbook sre.hosts.provision for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:11 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2096.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:05 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 23:05 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:59 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2095.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:39 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:38 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:51 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp7012.magru.wmnet with OS trixie
* 21:48 Dreamy_Jazz: Evening UTC backport window done
* 21:42 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7006.magru.wmnet [reason: trixie reimaging]
* 21:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7006.magru.wmnet with OS trixie
* 21:25 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1235552{{!}}Migrate EmailAuth, step 2 (T404334)]] (duration: 25m 34s)
* 21:24 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7007.magru.wmnet [reason: trixie reimaging]
* 21:22 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7007.magru.wmnet with OS trixie
* 21:21 tgr@deploy2002: tgr: Continuing with sync
* 21:13 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
* 21:09 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
* 21:02 tgr@deploy2002: tgr: Backport for [[gerrit:1235552{{!}}Migrate EmailAuth, step 2 (T404334)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:00 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1235552{{!}}Migrate EmailAuth, step 2 (T404334)]]
* 20:59 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7012.magru.wmnet with OS trixie
* 20:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7007.magru.wmnet with reason: host reimage
* 20:50 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249983{{!}}Deploy participant recruitment survey on ptwiki and trwiki (T419275)]], [[gerrit:1238733{{!}}wikifunctions: Drop temporary WikifunctionsEnableHTMLOutput flag (T397402)]], [[gerrit:1238734{{!}}wikifunctions: Drop temporary WikifunctionsEnableWikidataInputTypes flag (T397403)]], [[gerrit:1249393{{!}}build: Upgrade mediawiki-phan-config from 0.18.0 to 0.2
* 20:48 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7007.magru.wmnet with reason: host reimage
* 20:46 jforrester@deploy2002: dani, jforrester: Continuing with sync
* 20:45 jforrester@deploy2002: dani, jforrester: Backport for [[gerrit:1249983{{!}}Deploy participant recruitment survey on ptwiki and trwiki (T419275)]], [[gerrit:1238733{{!}}wikifunctions: Drop temporary WikifunctionsEnableHTMLOutput flag (T397402)]], [[gerrit:1238734{{!}}wikifunctions: Drop temporary WikifunctionsEnableWikidataInputTypes flag (T397403)]], [[gerrit:1249393{{!}}build: Upgrade mediawiki-phan-config from 0.18.0 to 0.20.0 (T41
* 20:43 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7006.magru.wmnet with OS trixie
* 20:43 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1249983{{!}}Deploy participant recruitment survey on ptwiki and trwiki (T419275)]], [[gerrit:1238733{{!}}wikifunctions: Drop temporary WikifunctionsEnableHTMLOutput flag (T397402)]], [[gerrit:1238734{{!}}wikifunctions: Drop temporary WikifunctionsEnableWikidataInputTypes flag (T397403)]], [[gerrit:1249393{{!}}build: Upgrade mediawiki-phan-config from 0.18.0 to 0.20
* 20:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7006.magru.wmnet with OS trixie
* 20:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - cdobbins@cumin2002"
* 20:38 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1240012{{!}}Enable personal main menu to all users in Minerva Neue skin (T413912)]], [[gerrit:1250007{{!}}Enables legacy processing in ParserOutputPostCacheTransform when cached (T372592)]], [[gerrit:1250015{{!}}Parser: Raise minimum TTL from 30 min to 'next midnight' in miser mode (T416616 T416540 T419439)]] (duration: 12m 58s)
* 20:36 cdobbins@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - cdobbins@cumin2002"
* 20:34 jforrester@deploy2002: jforrester, cscott, bwang: Continuing with sync
* 20:27 jforrester@deploy2002: jforrester, cscott, bwang: Backport for [[gerrit:1240012{{!}}Enable personal main menu to all users in Minerva Neue skin (T413912)]], [[gerrit:1250007{{!}}Enables legacy processing in ParserOutputPostCacheTransform when cached (T372592)]], [[gerrit:1250015{{!}}Parser: Raise minimum TTL from 30 min to 'next midnight' in miser mode (T416616 T416540 T419439)]] synced to the testservers (see https://wikitech.wi
* 20:25 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1240012{{!}}Enable personal main menu to all users in Minerva Neue skin (T413912)]], [[gerrit:1250007{{!}}Enables legacy processing in ParserOutputPostCacheTransform when cached (T372592)]], [[gerrit:1250015{{!}}Parser: Raise minimum TTL from 30 min to 'next midnight' in miser mode (T416616 T416540 T419439)]]
* 20:25 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7007.magru.wmnet with OS trixie
* 20:24 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7007.magru.wmnet [reason: trixie reimaging]
* 20:24 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7005.magru.wmnet [reason: trixie reimaging]
* 20:16 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7005.magru.wmnet with OS trixie
* 20:10 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
* 20:03 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7013.*
* 20:03 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
* 19:50 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7013.magru.wmnet with OS trixie
* 19:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7005.magru.wmnet with reason: host reimage
* 19:42 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7005.magru.wmnet with reason: host reimage
* 19:40 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7006.magru.wmnet with OS trixie
* 19:40 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7006.magru.wmnet [reason: trixie reimaging]
* 19:39 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7004.magru.wmnet [reason: trixie reimaging]
* 19:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7013.magru.wmnet with reason: host reimage
* 19:19 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7005.magru.wmnet with OS trixie
* 19:19 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7004.magru.wmnet with OS trixie
* 19:19 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7005.magru.wmnet [reason: trixie reimaging]
* 19:18 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7013.magru.wmnet with reason: host reimage
* 19:17 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7003.magru.wmnet [reason: trixie reimaging]
* 19:16 brennen@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.19 refs [[phab:T413810|T413810]]
* 19:16 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7003.magru.wmnet with OS trixie
* 19:09 brennen: 1.46.0-wmf.19 train status: blockers believed resolved, rolling to group0
* 19:07 brennen@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249937{{!}}Re-add correct namespace for translatable pages (T419294)]] (duration: 12m 30s)
* 19:06 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
* 19:01 brennen@deploy2002: abi, brennen: Continuing with sync
* 18:58 brennen@deploy2002: abi, brennen: Backport for [[gerrit:1249937{{!}}Re-add correct namespace for translatable pages (T419294)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:55 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7013.magru.wmnet with OS trixie
* 18:54 brennen@deploy2002: Started scap sync-world: Backport for [[gerrit:1249937{{!}}Re-add correct namespace for translatable pages (T419294)]]
* 18:52 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7004.magru.wmnet with reason: host reimage
* 18:52 brennen@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.19 refs [[phab:T413810|T413810]] (duration: 38m 34s)
* 18:49 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7004.magru.wmnet with reason: host reimage
* 18:47 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7003.magru.wmnet with reason: host reimage
* 18:44 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7003.magru.wmnet with reason: host reimage
* 18:41 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7015.*
* 18:27 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7015.magru.wmnet with OS trixie
* 18:23 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7004.magru.wmnet with OS trixie
* 18:21 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7004.magru.wmnet [reason: trixie reimaging]
* 18:16 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7003.magru.wmnet with OS trixie
* 18:13 brennen@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.19 refs [[phab:T413810|T413810]]
* 18:13 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7003.magru.wmnet [reason: trixie reimaging]
* 18:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7003.magru.wmnet [reason: trixie reimaging]
* 18:00 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 17:59 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 17:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7015.magru.wmnet with reason: host reimage
* 17:59 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 17:59 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 17:56 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7015.magru.wmnet with reason: host reimage
* 17:54 hashar@deploy2002: Finished deploy [integration/docroot@f544f49]: Catch up with composer/npm dev dependencies. Noop for production (duration: 00m 11s)
* 17:54 hashar@deploy2002: Started deploy [integration/docroot@f544f49]: Catch up with composer/npm dev dependencies. Noop for production
* 17:43 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 17:43 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 17:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:32 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:31 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:30 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7015.magru.wmnet with OS trixie
* 17:30 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:29 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:28 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:28 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 17:28 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:26 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:23 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
* 17:22 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 17:21 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 17:12 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:12 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:11 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:11 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:09 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:09 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:01 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 16:40 andrew@dns1004: END - running authdns-update
* 16:38 andrew@dns1004: START - running authdns-update
* 16:25 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249993{{!}}Revert "CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled"]] (duration: 07m 45s)
* 16:21 reedy@deploy2002: reedy: Continuing with sync
* 16:19 reedy@deploy2002: reedy: Backport for [[gerrit:1249993{{!}}Revert "CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:17 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1249993{{!}}Revert "CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled"]]
* 15:59 jynus@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 15:59 taavi: update cr firewall policy for codfw1dev ldap tree https://gerrit.wikimedia.org/r/1249985
* 15:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-fr-tech: apply
* 15:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-fr-tech: apply
* 15:55 jynus@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 15:48 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:34 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:28 brouberol@dns1004: END - running authdns-update
* 15:27 brouberol@dns1004: START - running authdns-update
* 15:10 swfrench@cumin2002: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002"
* 15:10 swfrench@cumin2002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002
* 15:09 swfrench@cumin2002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002
* 15:09 swfrench@cumin2002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002"
* 15:05 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:59 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:58 sukhe: sudo cumin -b1 -s15 "C:bird" "run-puppet-agent --enable 'merging CR 1238007; add function return type'"
* 14:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:58 sukhe: sudo cumin -b1 -s15 "C:bird" "run-puppet-agent 'merging CR 1238007; add function return type'"
* 14:51 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:42 sukhe: sudo cumin "C:bird" "disable-puppet 'merging CR 1238007; add function return type'"
* 14:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host ml-serve1014
* 14:39 elukey@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:36 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host ml-serve1014
* 14:36 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.powercycle (exit_code=99) for host ml-serve1014
* 14:36 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host ml-serve1014
* 14:12 otto@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249367{{!}}stream: mediawiki.page_edit_type_simple.dev0 (T351225)]] (duration: 11m 05s)
* 14:08 otto@deploy2002: akhatun, otto: Continuing with sync
* 14:02 otto@deploy2002: akhatun, otto: Backport for [[gerrit:1249367{{!}}stream: mediawiki.page_edit_type_simple.dev0 (T351225)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:01 otto@deploy2002: Started scap sync-world: Backport for [[gerrit:1249367{{!}}stream: mediawiki.page_edit_type_simple.dev0 (T351225)]]
* 13:49 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
* 13:43 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:28 vgutierrez: testing acme-chief 0.39 in acmechief-test2001 - [[phab:T419352|T419352]]
* 13:27 vgutierrez: upload acme-chief 0.39 to bookworm-wikimedia (apt.wm.o) - [[phab:T419352|T419352]]
* 13:16 jiji@cumin1003: END (FAIL) - Cookbook sre.memcached.roll-reboot-restart (exit_code=1) rolling restart_daemons on A:memcached-canary
* 13:16 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling restart_daemons on A:memcached-canary
* 13:12 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249903{{!}}Require 2FA from 6 other user groups ($wgRestrictedGroups) (T418580)]], [[gerrit:1249035{{!}}kaiwiki: add logo, stiename, projectnamespace and timezone (T414237)]] (duration: 08m 45s)
* 13:08 mszwarc@deploy2002: mszwarc, anzx: Continuing with sync
* 13:05 mszwarc@deploy2002: mszwarc, anzx: Backport for [[gerrit:1249903{{!}}Require 2FA from 6 other user groups ($wgRestrictedGroups) (T418580)]], [[gerrit:1249035{{!}}kaiwiki: add logo, stiename, projectnamespace and timezone (T414237)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1249903{{!}}Require 2FA from 6 other user groups ($wgRestrictedGroups) (T418580)]], [[gerrit:1249035{{!}}kaiwiki: add logo, stiename, projectnamespace and timezone (T414237)]]
* 13:03 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad
* 12:57 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1015.eqiad.wmnet with OS bookworm
* 12:56 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 12:51 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1014.eqiad.wmnet with OS bookworm
* 12:50 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ml-serve1014
* 12:50 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ml-serve1014
* 12:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:49 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:49 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:48 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:47 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:45 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:44 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:42 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:42 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling restart_daemons on A:memcached-canary
* 12:42 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling restart_daemons on A:memcached-canary
* 12:31 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 12:10 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 12:10 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 11:59 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 11:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe2024.codfw.wmnet with OS bullseye
* 11:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1003"
* 11:17 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1003"
* 11:15 Emperor: rebalance codfw swift rings [[phab:T354872|T354872]]
* 10:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-fe2024.codfw.wmnet with reason: host reimage
* 10:47 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-fe2024.codfw.wmnet with reason: host reimage
* 10:31 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host ms-fe2024.codfw.wmnet with OS bullseye
* 10:30 ayounsi@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-fe2024.codfw.wmnet with OS bullseye
* 10:20 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host ms-fe2024.codfw.wmnet with OS bullseye
* 10:17 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device cr2-eqdfw
* 09:31 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device cr2-eqdfw
* 09:22 derick@deploy2002: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=loginwiki --logwiki=metawiki TMPRI1975 FondueFanatic # [[phab:T419499|T419499]]
* 09:00 arnaudb@dns1005: END - running authdns-update
* 09:00 godog: restore all host interfaces - [[phab:T417393|T417393]]
* 08:58 arnaudb@dns1005: START - running authdns-update
* 08:30 godog: disabled interface for cloudcephmon1004 - [[phab:T417393|T417393]]
* 08:22 godog: disabled interfaces for cloudcephosd1021 cloudcephosd1042 cloudcephosd1043 cloudcephosd1018 cloudcephosd1022 - [[phab:T417393|T417393]]
* 08:18 godog: disabled interfaces for cloudcephosd1016 cloudcephosd1017 cloudcephosd1016 cloudcephosd1018 cloudcephosd1017 cloudcephosd1035 - [[phab:T417393|T417393]]
* 08:05 godog: start disabling cloudcephosd interfaces - [[phab:T417393|T417393]]
* 07:49 godog: prep cloudsw reboot tests 'ceph osd set noout' - [[phab:T417393|T417393]]
* 07:41 filippo@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 19 hosts with reason: switch down tests
* 06:14 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs2009.codfw.wmnet with OS bookworm
* 04:09 pt1979@cumin2002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device asw1-23-ulsfo
* 04:08 pt1979@cumin2002: START - Cookbook sre.network.tls for network device asw1-23-ulsfo
* 04:01 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.16 (duration: 01m 48s)
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 10s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:37 ryankemper: [WDQS] [[phab:T410573|T410573]] repooled wdqs1011.eqiad.wmnet - erroneously depooled since `2025-11-19` by failed `sre.wdqs.reboot` cookbook
* 00:42 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 00:39 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
== 2026-03-09 ==
* 22:51 rzl: root@apt1002:~# reprepro --noskipold --restrict vopsbot update bookworm-wikimedia
* 22:34 bking@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM dse-k8s-ctrl1001.eqiad.wmnet
* 22:32 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1001.eqiad.wmnet
* 22:30 bking@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:29 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:28 bking@cumin2002: END (FAIL) - Cookbook sre.ganeti.reboot-vm (exit_code=99) for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:28 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:28 bking@cumin2002: END (FAIL) - Cookbook sre.ganeti.reboot-vm (exit_code=99) for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:28 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:03 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw2004-dev.codfw.wmnet with OS trixie
* 22:02 alexsanford: Redeployed security fix for [[phab:T419186|T419186]]
* 21:44 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
* 21:40 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
* 21:37 cdobbins@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7002.magru.wmnet
* 21:34 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7002.magru.wmnet with OS trixie
* 21:29 alexsanford: Deployed security fix for [[phab:T419186|T419186]]
* 21:22 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudgw2004-dev.codfw.wmnet with OS trixie
* 21:21 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudgw2004-dev.codfw.wmnet with OS trixie
* 21:17 dani@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249370{{!}}Pre-deploy participant recruitment survey on ptwiki and trwiki (T419275)]] (duration: 08m 15s)
* 21:13 dani@deploy2002: dani: Continuing with sync
* 21:11 dani@deploy2002: dani: Backport for [[gerrit:1249370{{!}}Pre-deploy participant recruitment survey on ptwiki and trwiki (T419275)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:09 dani@deploy2002: Started scap sync-world: Backport for [[gerrit:1249370{{!}}Pre-deploy participant recruitment survey on ptwiki and trwiki (T419275)]]
* 21:08 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
* 21:05 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cp7002.magru.wmnet with reason: host reimage
* 21:02 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7002.magru.wmnet with reason: host reimage
* 21:01 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
* 21:01 tgr_: removed private code for [[phab:T397244|T397244]]
* 21:01 ryankemper: [WDQS] These hosts are re-entering a failed state quickly enough that we need to identify the offender if we want to restore proper service. We could add a temporary hack that restarts them every few minutes so we at least maintain some uptime, but the root cause is the usual 'we need a requestctl rule to block whoever's killing us' scenario
* 21:00 cdobbins@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7001.magru.wmnet [reason: Trixie reimaging]
* 20:57 ryankemper: [WDQS] Auto-remediation would eventually have restarted these, but some of them were staying below our current threshold of `threads > 1200`. We may want to lower the threshold, or consider an additional metric type in the future
* 20:56 ryankemper: [WDQS] `ryankemper@cumin2002:~$ sudo -E cumin 'A:wdqs-main AND P<nowiki>{</nowiki>wdqs1*<nowiki>}</nowiki>' 'systemctl restart wdqs-blazegraph'`
* 20:54 ryankemper: [WDQS] `ryankemper@cumin2002:~$ sudo -E cumin 'A:wdqs-main AND P<nowiki>{</nowiki>wdqs2*<nowiki>}</nowiki>' 'systemctl restart wdqs-blazegraph'`
* 20:44 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudgw2004-dev.codfw.wmnet with OS trixie
* 20:43 tgr@deploy2002: Unlocked for deployment [MediaWiki]: working on private change (duration: 10m 10s)
* 20:36 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7002.magru.wmnet with OS trixie
* 20:33 tgr@deploy2002: Locking from deployment [MediaWiki]: working on private change
* 20:31 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247119{{!}}Enable parser survey for opted-out users on German/French/Polish wikis (T414852)]], [[gerrit:1249316{{!}}lift IP cap for womens month editathon (T419109)]] (duration: 13m 36s)
* 20:27 tgr@deploy2002: cscott, tgr, anzx: Continuing with sync
* 20:19 tgr@deploy2002: cscott, tgr, anzx: Backport for [[gerrit:1247119{{!}}Enable parser survey for opted-out users on German/French/Polish wikis (T414852)]], [[gerrit:1249316{{!}}lift IP cap for womens month editathon (T419109)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1247119{{!}}Enable parser survey for opted-out users on German/French/Polish wikis (T414852)]], [[gerrit:1249316{{!}}lift IP cap for womens month editathon (T419109)]]
* 20:13 aaron@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249363{{!}}Remove redundant math spec file from wwwportal (T418188)]] (duration: 06m 56s)
* 20:09 aaron@deploy2002: aaron: Continuing with sync
* 20:08 aaron@deploy2002: aaron: Backport for [[gerrit:1249363{{!}}Remove redundant math spec file from wwwportal (T418188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:06 aaron@deploy2002: Started scap sync-world: Backport for [[gerrit:1249363{{!}}Remove redundant math spec file from wwwportal (T418188)]]
* 20:01 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7016.*
* 19:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7001.magru.wmnet with OS trixie
* 19:51 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7016.magru.wmnet with OS trixie
* 19:49 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248911{{!}}Stop writing to il_to on commonswiki (T415787)]] (duration: 06m 04s)
* 19:45 zabe@deploy2002: zabe: Continuing with sync
* 19:44 zabe@deploy2002: zabe: Backport for [[gerrit:1248911{{!}}Stop writing to il_to on commonswiki (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:43 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248911{{!}}Stop writing to il_to on commonswiki (T415787)]]
* 19:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 19:28 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 19:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7001.magru.wmnet with reason: host reimage
* 19:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7016.magru.wmnet with reason: host reimage
* 19:23 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7001.magru.wmnet with reason: host reimage
* 19:19 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7016.magru.wmnet with reason: host reimage
* 19:15 cwhite@deploy2002: Finished deploy [performance/arc-lamp@aa8da8b]: {{Gerrit|Ie7e0355f89294a2927f9dbc28afec3a62d1752de}} (duration: 00m 08s)
* 19:15 cwhite@deploy2002: Started deploy [performance/arc-lamp@aa8da8b]: {{Gerrit|Ie7e0355f89294a2927f9dbc28afec3a62d1752de}}
* 19:14 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 19:14 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 19:05 herron@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249365{{!}}udp2log: switch to new hosts]] (duration: 09m 38s)
* 19:03 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:03 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:01 herron@deploy2002: herron: Continuing with sync
* 19:00 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:00 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 18:59 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 18:59 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 18:59 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 18:59 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 18:57 herron@deploy2002: herron: Backport for [[gerrit:1249365{{!}}udp2log: switch to new hosts]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:57 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7001.magru.wmnet with OS trixie
* 18:55 herron@deploy2002: Started scap sync-world: Backport for [[gerrit:1249365{{!}}udp2log: switch to new hosts]]
* 18:55 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7016.magru.wmnet with OS trixie
* 18:50 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 18:49 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 18:44 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 18:44 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 18:34 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 18:34 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 18:33 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 18:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 18:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 18:32 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 18:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 18:29 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 18:27 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 18:27 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 18:23 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 18:23 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
* 18:23 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 18:16 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 18:16 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 18:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1001.eqiad.wmnet
* 18:05 herron@deploy2002: Sync cancelled.
* 18:04 herron@deploy2002: herron: Backport for [[gerrit:1249361{{!}}Revert "udp2log: switch to new hosts"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:02 herron@deploy2002: Started scap sync-world: Backport for [[gerrit:1249361{{!}}Revert "udp2log: switch to new hosts"]]
* 18:01 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1001.eqiad.wmnet
* 17:54 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:47 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:42 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:42 herron@deploy2002: Sync cancelled.
* 17:40 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:39 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:38 mutante: contint1003 - unable to get uptime Caused by: Cumin execution failed (exit_code=2) [101/240] - attempted manual powercycle - Initializing Firmware Interfaces... blank screen [[phab:T418544|T418544]]
* 17:34 mutante: contint1003.mgmt - racadm serveraction powercycle [[phab:T418544|T418544]] - not reacting
* 17:25 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:25 herron@deploy2002: herron: Backport for [[gerrit:1249332{{!}}udp2log: switch to new hosts (T417002)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:23 herron@deploy2002: Started scap sync-world: Backport for [[gerrit:1249332{{!}}udp2log: switch to new hosts (T417002)]]
* 17:19 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host netflow4003.ulsfo.wmnet
* 17:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host netflow4003.ulsfo.wmnet with OS bookworm
* 17:13 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:08 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet
* 17:03 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 17:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet
* 17:00 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis kaiwiki in section s5
* 16:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on netflow4003.ulsfo.wmnet with reason: host reimage
* 16:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on netflow4003.ulsfo.wmnet with reason: host reimage
* 16:37 moritzm: installing gnupg security updates
* 16:31 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host netflow4003.ulsfo.wmnet with OS bookworm
* 16:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
* 16:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
* 16:30 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netflow4003.ulsfo.wmnet on all recursors
* 16:30 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netflow4003.ulsfo.wmnet on all recursors
* 16:30 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
* 16:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
* 16:26 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:26 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host netflow4003.ulsfo.wmnet
* 16:26 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 15:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on prometheus4003.ulsfo.wmnet with reason: host reimage
* 15:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on prometheus4003.ulsfo.wmnet with reason: host reimage
* 15:44 vgutierrez: vgutierrez@acmechief-test2001:~$ sudo -i systemctl disable reload-acme-chief-backend.timer - [[phab:T419352|T419352]]
* 15:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 15:37 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 15:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 15:30 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 15:26 elukey@cumin1003: END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 15:24 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 15:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2004.codfw.wmnet
* 15:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people2004.codfw.wmnet
* 15:12 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 15:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo1001.eqiad.wmnet
* 15:08 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 15:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo1001.eqiad.wmnet
* 14:50 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2009.codfw.wmnet with OS bookworm
* 14:49 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs2009.codfw.wmnet with OS bullseye
* 14:45 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 14:35 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249291{{!}}Hide 2fa-warning Echo category from preferences (T419111)]] (duration: 06m 07s)
* 14:35 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis kaiwiki in section s5
* 14:34 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.sanitize-wiki (exit_code=99) Managing sanitization for wikis urwikisource in section s5
* 14:31 mszwarc@deploy2002: mszwarc: Continuing with sync
* 14:31 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1249291{{!}}Hide 2fa-warning Echo category from preferences (T419111)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:30 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis urwikisource in section s5
* 14:29 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1249291{{!}}Hide 2fa-warning Echo category from preferences (T419111)]]
* 14:25 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Checking sanitization for wikis urwikisource in section s5
* 14:22 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Checking sanitization for wikis urwikisource in section s5
* 14:20 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 14:15 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 14:15 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249243{{!}}JS SDK: Add getExperimentByPrefix() (T419191)]], [[gerrit:1249242{{!}}ext.wikimediaEvents: pageVisit -> loggedOutReaderRetention (T419191)]] (duration: 09m 39s)
* 14:11 phuedx@deploy2002: phuedx: Continuing with sync
* 14:07 phuedx@deploy2002: phuedx: Backport for [[gerrit:1249243{{!}}JS SDK: Add getExperimentByPrefix() (T419191)]], [[gerrit:1249242{{!}}ext.wikimediaEvents: pageVisit -> loggedOutReaderRetention (T419191)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:05 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1249243{{!}}JS SDK: Add getExperimentByPrefix() (T419191)]], [[gerrit:1249242{{!}}ext.wikimediaEvents: pageVisit -> loggedOutReaderRetention (T419191)]]
* 14:03 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet
* 13:54 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2009.codfw.wmnet with OS bullseye
* 13:50 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249262{{!}}Disable MetricsPlatform extension (T416865)]] (duration: 08m 02s)
* 13:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet
* 13:46 phuedx@deploy2002: phuedx, sfaci: Continuing with sync
* 13:44 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 13:44 phuedx@deploy2002: phuedx, sfaci: Backport for [[gerrit:1249262{{!}}Disable MetricsPlatform extension (T416865)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:42 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1249262{{!}}Disable MetricsPlatform extension (T416865)]]
* 13:39 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248075{{!}}Confirmemail: Log delay between email sent and confirmation (T415902)]], [[gerrit:1247651{{!}}Enable confirmemail logstash channel (T415902)]] (duration: 11m 16s)
* 13:35 phuedx@deploy2002: mmartorana, phuedx: Continuing with sync
* 13:30 phuedx@deploy2002: mmartorana, phuedx: Backport for [[gerrit:1248075{{!}}Confirmemail: Log delay between email sent and confirmation (T415902)]], [[gerrit:1247651{{!}}Enable confirmemail logstash channel (T415902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1248075{{!}}Confirmemail: Log delay between email sent and confirmation (T415902)]], [[gerrit:1247651{{!}}Enable confirmemail logstash channel (T415902)]]
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 13:04 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 12:55 moritzm: installing Kerberos security updates
* 12:29 moritzm: installing python3.9 security updates
* 12:11 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 12:00 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1239026{{!}}Revert "CommonSettings: Temporarily set $wgOATHUserHandlesTable = true" (T416544)]], [[gerrit:1249253{{!}}CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled]] (duration: 06m 13s)
* 11:56 reedy@deploy2002: reedy: Continuing with sync
* 11:56 reedy@deploy2002: reedy: Backport for [[gerrit:1239026{{!}}Revert "CommonSettings: Temporarily set $wgOATHUserHandlesTable = true" (T416544)]], [[gerrit:1249253{{!}}CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:54 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1239026{{!}}Revert "CommonSettings: Temporarily set $wgOATHUserHandlesTable = true" (T416544)]], [[gerrit:1249253{{!}}CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled]]
* 11:44 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249245{{!}}Hooks: Really only add global logging context for pageviews]] (duration: 12m 02s)
* 11:38 phuedx@deploy2002: phuedx: Continuing with sync
* 11:34 phuedx@deploy2002: phuedx: Backport for [[gerrit:1249245{{!}}Hooks: Really only add global logging context for pageviews]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:32 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1249245{{!}}Hooks: Really only add global logging context for pageviews]]
* 11:29 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 11:29 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 11:03 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 11:03 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:57 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 10:56 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 10:50 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:49 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:45 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus4003.ulsfo.wmnet
* 10:45 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 10:45 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
* 10:44 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus4003.ulsfo.wmnet on all recursors
* 10:43 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache prometheus4003.ulsfo.wmnet on all recursors
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
* 10:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
* 10:40 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:39 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:39 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:39 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host prometheus4003.ulsfo.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:33 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:17 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 10:12 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:51 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 09:46 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 09:43 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:40 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus4003.ulsfo.wmnet
* 09:40 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 09:35 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:35 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host prometheus4003.ulsfo.wmnet
* 09:31 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host frdb1008
* 09:31 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host frdb1008
* 09:29 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 09:05 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 08:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
* 08:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 08:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 08:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 08:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 08:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 08:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 08:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
* 08:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
* 08:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 08:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 08:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
* 08:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:25 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
* 08:24 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 08:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 08:21 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main'.
* 08:16 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
* 08:16 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
* 08:07 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo and group 1
* 08:07 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo and group 1
* 07:37 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248806{{!}}Add a script to send mandatory 2FA Echo notification (T419111)]], [[gerrit:1248821{{!}}Set $wgOATH2FARequiredGroupRemovalPages for interface-admins (T417880)]] (duration: 34m 41s)
* 07:23 mszwarc@deploy2002: mszwarc: Continuing with sync
* 07:22 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1248806{{!}}Add a script to send mandatory 2FA Echo notification (T419111)]], [[gerrit:1248821{{!}}Set $wgOATH2FARequiredGroupRemovalPages for interface-admins (T417880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:02 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1248806{{!}}Add a script to send mandatory 2FA Echo notification (T419111)]], [[gerrit:1248821{{!}}Set $wgOATH2FARequiredGroupRemovalPages for interface-admins (T417880)]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 58s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-08 ==
* 20:28 vgutierrez@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on acmechief-test2001.codfw.wmnet with reason: GTS issues
* 02:01 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 00m 59s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-07 ==
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 23s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:20 krinkle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248952{{!}}CSP: restore toolforge/wmcs entry in false positive list]] (duration: 10m 46s)
* 01:16 krinkle@deploy2002: krinkle: Continuing with sync
* 01:11 krinkle@deploy2002: krinkle: Backport for [[gerrit:1248952{{!}}CSP: restore toolforge/wmcs entry in false positive list]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:09 krinkle@deploy2002: Started scap sync-world: Backport for [[gerrit:1248952{{!}}CSP: restore toolforge/wmcs entry in false positive list]]
* 00:22 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2043.*
* 00:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp2043.codfw.wmnet
* 00:05 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp2043.codfw.wmnet
== 2026-03-06 ==
* 23:29 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs2009.codfw.wmnet with OS bullseye
* 23:13 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 23:07 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 22:46 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs2009
* 22:46 ryankemper@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2009
* 22:46 ryankemper@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2009
* 22:46 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs2009.codfw.wmnet 141.0.192.10.in-addr.arpa 1.4.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:46 ryankemper@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs2009.codfw.wmnet 141.0.192.10.in-addr.arpa 1.4.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:45 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:45 ryankemper@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs2009 - ryankemper@cumin2002"
* 22:45 ryankemper@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs2009 - ryankemper@cumin2002"
* 22:41 ryankemper@cumin2002: START - Cookbook sre.dns.netbox
* 22:40 ryankemper@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs2009
* 22:39 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2009.codfw.wmnet with OS bullseye
* 19:48 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:47 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:47 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:46 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:36 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host wdqs2009.codfw.wmnet
* 19:23 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2009.codfw.wmnet
* 19:17 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on wdqs2009.codfw.wmnet with reason: NFS might be hung, about to reboot
* 18:56 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp2043.codfw.wmnet with reason: troubleshooting for network drops
* 18:44 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp2043.*
* 18:29 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts an-backup-datanode1033.eqiad.wmnet
* 18:29 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:29 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-backup-datanode1033.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1003"
* 18:28 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-backup-datanode1033.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1003"
* 17:59 ebernhardson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248858{{!}}cirrus: Use https for semanticsearch-test cluster]] (duration: 11m 20s)
* 17:53 ebernhardson@deploy2002: ebernhardson: Continuing with sync
* 17:52 ebernhardson@deploy2002: ebernhardson: Backport for [[gerrit:1248858{{!}}cirrus: Use https for semanticsearch-test cluster]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:51 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:51 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:47 ebernhardson@deploy2002: Started scap sync-world: Backport for [[gerrit:1248858{{!}}cirrus: Use https for semanticsearch-test cluster]]
* 17:42 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:42 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:40 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:40 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:11 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:11 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:10 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 17:05 hashar@deploy2002: Finished deploy [gerrit/gerrit@b8183ba]: wm-checks-api: add tooltip to the CheckRun Run action (duration: 00m 13s)
* 17:05 hashar@deploy2002: Started deploy [gerrit/gerrit@b8183ba]: wm-checks-api: add tooltip to the CheckRun Run action
* 17:04 btullis@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-backup-datanode1033.eqiad.wmnet
* 16:48 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 16:48 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 16:23 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 16:23 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 15:57 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:57 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:56 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:56 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:52 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2354-2356].codfw.wmnet
* 15:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2354-2356].codfw.wmnet
* 15:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2356.codfw.wmnet with OS trixie
* 15:46 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2355.codfw.wmnet with OS trixie
* 15:42 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2354.codfw.wmnet with OS trixie
* 15:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2356.codfw.wmnet with reason: host reimage
* 15:31 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
* 15:30 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
* 15:28 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main'.
* 15:28 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
* 15:28 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
* 15:26 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
* 15:26 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
* 15:26 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2355.codfw.wmnet with reason: host reimage
* 15:24 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 15:23 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 15:23 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2354.codfw.wmnet with reason: host reimage
* 15:19 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 15:19 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 15:17 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 15:17 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 15:17 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2356.codfw.wmnet with reason: host reimage
* 15:16 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2355.codfw.wmnet with reason: host reimage
* 15:16 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2354.codfw.wmnet with reason: host reimage
* 15:15 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main'.
* 15:10 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 15:09 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 15:08 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 15:08 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 15:06 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 15:05 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 15:05 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 15:05 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 15:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2356.codfw.wmnet with OS trixie
* 15:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2355.codfw.wmnet with OS trixie
* 15:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2354.codfw.wmnet with OS trixie
* 15:02 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
* 15:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2348-2353].codfw.wmnet
* 15:02 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
* 15:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2348-2353].codfw.wmnet
* 14:59 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2353.codfw.wmnet with OS trixie
* 14:57 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2349.codfw.wmnet with OS trixie
* 14:57 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:56 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
* 14:53 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:52 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
* 14:52 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2351.codfw.wmnet with OS trixie
* 14:49 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 14:49 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 14:48 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2352.codfw.wmnet with OS trixie
* 14:48 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 14:48 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 14:48 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:47 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 14:45 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2350.codfw.wmnet with OS trixie
* 14:44 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 14:43 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:43 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 14:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2348.codfw.wmnet with OS trixie
* 14:41 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2353.codfw.wmnet with reason: host reimage
* 14:37 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2349.codfw.wmnet with reason: host reimage
* 14:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2351.codfw.wmnet with reason: host reimage
* 14:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2352.codfw.wmnet with reason: host reimage
* 14:29 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:28 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:26 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2350.codfw.wmnet with reason: host reimage
* 14:23 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2348.codfw.wmnet with reason: host reimage
* 14:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2351.codfw.wmnet with reason: host reimage
* 14:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2352.codfw.wmnet with reason: host reimage
* 14:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2353.codfw.wmnet with reason: host reimage
* 14:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2350.codfw.wmnet with reason: host reimage
* 14:19 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2349.codfw.wmnet with reason: host reimage
* 14:19 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2348.codfw.wmnet with reason: host reimage
* 14:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2353.codfw.wmnet with OS trixie
* 14:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2352.codfw.wmnet with OS trixie
* 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2351.codfw.wmnet with OS trixie
* 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2350.codfw.wmnet with OS trixie
* 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2349.codfw.wmnet with OS trixie
* 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2348.codfw.wmnet with OS trixie
* 14:03 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2349.codfw.wmnet with OS trixie
* 14:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2349.codfw.wmnet with OS trixie
* 14:03 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2348.codfw.wmnet with OS trixie
* 14:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2348.codfw.wmnet with OS trixie
* 14:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2342-2347].codfw.wmnet
* 14:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2342-2347].codfw.wmnet
* 14:01 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2347.codfw.wmnet with OS trixie
* 13:57 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2346.codfw.wmnet with OS trixie
* 13:55 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2343.codfw.wmnet with OS trixie
* 13:50 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2345.codfw.wmnet with OS trixie
* 13:48 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2344.codfw.wmnet with OS trixie
* 13:45 dreamyjazz@deploy2002: mwscript-k8s job started: foreachwikiindblist checkuser-suggested-investigations CheckUser:queueAutoCloseSICases.php # [[phab:T418591|T418591]]
* 13:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2342.codfw.wmnet with OS trixie
* 13:42 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2347.codfw.wmnet with reason: host reimage
* 13:38 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2346.codfw.wmnet with reason: host reimage
* 13:35 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2343.codfw.wmnet with reason: host reimage
* 13:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2345.codfw.wmnet with reason: host reimage
* 13:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2344.codfw.wmnet with reason: host reimage
* 13:24 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2342.codfw.wmnet with reason: host reimage
* 13:21 Dreamy_Jazz: Running foreachwikiindblist checkuser-suggested-investigations.dblist ~/PopulateSiuInfo.php --batch-size=1000 for [[phab:T411118|T411118]]
* 13:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2347.codfw.wmnet with reason: host reimage
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2346.codfw.wmnet with reason: host reimage
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2345.codfw.wmnet with reason: host reimage
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2344.codfw.wmnet with reason: host reimage
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2343.codfw.wmnet with reason: host reimage
* 13:19 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2342.codfw.wmnet with reason: host reimage
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2347.codfw.wmnet with OS trixie
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2346.codfw.wmnet with OS trixie
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2345.codfw.wmnet with OS trixie
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2344.codfw.wmnet with OS trixie
* 13:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2343.codfw.wmnet with OS trixie
* 13:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2342.codfw.wmnet with OS trixie
* 13:05 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2336-2341].codfw.wmnet
* 13:05 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2336-2341].codfw.wmnet
* 13:01 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2341.codfw.wmnet with OS trixie
* 12:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2340.codfw.wmnet with OS trixie
* 12:49 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2337.codfw.wmnet with OS trixie
* 12:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2338.codfw.wmnet with OS trixie
* 12:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2336.codfw.wmnet with OS trixie
* 12:40 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2341.codfw.wmnet with reason: host reimage
* 12:35 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2339.codfw.wmnet with OS trixie
* 12:34 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2340.codfw.wmnet with reason: host reimage
* 12:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2337.codfw.wmnet with reason: host reimage
* 12:26 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2338.codfw.wmnet with reason: host reimage
* 12:22 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2336.codfw.wmnet with reason: host reimage
* 12:18 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2339.codfw.wmnet with reason: host reimage
* 12:13 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2340.codfw.wmnet with reason: host reimage
* 12:13 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2341.codfw.wmnet with reason: host reimage
* 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2337.codfw.wmnet with reason: host reimage
* 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2338.codfw.wmnet with reason: host reimage
* 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2336.codfw.wmnet with reason: host reimage
* 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2339.codfw.wmnet with reason: host reimage
* 12:00 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2341.codfw.wmnet with OS trixie
* 12:00 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2340.codfw.wmnet with OS trixie
* 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2339.codfw.wmnet with OS trixie
* 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2338.codfw.wmnet with OS trixie
* 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2337.codfw.wmnet with OS trixie
* 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2336.codfw.wmnet with OS trixie
* 11:56 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2333-2335].codfw.wmnet
* 11:55 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2333-2335].codfw.wmnet
* 11:55 elukey@cumin1003: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 11:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1207.eqiad.wmnet
* 11:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2335.codfw.wmnet with OS trixie
* 11:53 moritzm: uploaded icu 72.1-3+deb12u1~wmf11u1 to component/php83-icu72 [[phab:T419058|T419058]] (backport of ICU 72 from Bookworm to Bullseye, built to be co-installable with the native ICU from Bullseye)
* 11:50 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2334.codfw.wmnet with OS trixie
* 11:49 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1207.eqiad.wmnet
* 11:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1205.eqiad.wmnet
* 11:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2333.codfw.wmnet with OS trixie
* 11:39 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 11:39 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1205.eqiad.wmnet
* 11:34 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2335.codfw.wmnet with reason: host reimage
* 11:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2334.codfw.wmnet with reason: host reimage
* 11:27 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2333.codfw.wmnet with reason: host reimage
* 11:23 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2335.codfw.wmnet with reason: host reimage
* 11:22 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2334.codfw.wmnet with reason: host reimage
* 11:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2333.codfw.wmnet with reason: host reimage
* 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2334.codfw.wmnet with OS trixie
* 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2335.codfw.wmnet with OS trixie
* 11:08 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2333.codfw.wmnet with OS trixie
* 11:06 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2332.codfw.wmnet
* 11:05 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2332.codfw.wmnet
* 11:02 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2332.codfw.wmnet with OS trixie
* 10:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2332.codfw.wmnet with reason: host reimage
* 10:36 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2332.codfw.wmnet with reason: host reimage
* 10:23 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2332.codfw.wmnet with OS trixie
* 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1199.eqiad.wmnet
* 10:21 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1194.eqiad.wmnet
* 10:16 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2332-2356].codfw.wmnet
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1194.eqiad.wmnet
* 10:09 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1199.eqiad.wmnet
* 10:09 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2332-2356].codfw.wmnet
* 09:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:39 Emperor: repool ms-fe1013 after PXE work [[phab:T401966|T401966]]
* 09:23 derick@deploy2002: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=pmswiki --logwiki=metawiki Wikilimes Limes.pink # [[phab:T419184|T419184]]
* 09:10 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:09 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:08 elukey@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-fe1013.eqiad.wmnet
* 09:08 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-fe1013.eqiad.wmnet
* 08:57 elukey@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-fe1013.eqiad.wmnet
* 08:56 elukey@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-fe1013.eqiad.wmnet
* 08:54 elukey@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-fe1013.eqiad.wmnet
* 08:42 elukey@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-fe1013.eqiad.wmnet
* 08:25 moritzm: uploaded openjdk-8 8u482-ga-1~deb12u1 to component/jdk8 of bookworm-wikimedia
* 08:11 moritzm: imported prometheus-ganeti-exporter 0.3+deb12u2 for bookworm-wikimedia [[phab:T419166|T419166]]
* 06:23 ryankemper@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
* 06:23 ryankemper@deploy2002: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
* 06:23 ryankemper@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
* 06:23 ryankemper@deploy2002: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
* 06:22 ryankemper@deploy2002: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
* 06:22 ryankemper@deploy2002: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
* 02:59 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:59 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
* 02:59 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
* 02:56 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 02:21 zabe: zabe@deploy2002:/srv/mediawiki-staging$ foreachwiki extensions/TimedMediaHandler/maintenance/migrateTranscodeStates.php --force # [[phab:T415064|T415064]]
* 02:16 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248658{{!}}Update interwiki cache]] (duration: 06m 38s)
* 02:12 zabe@deploy2002: mwscript-k8s job started: foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https # [[phab:T415978|T415978]], [[phab:T414241|T414241]]
* 02:12 zabe@deploy2002: zabe: Continuing with sync
* 02:11 zabe@deploy2002: zabe: Backport for [[gerrit:1248658{{!}}Update interwiki cache]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 02:09 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248658{{!}}Update interwiki cache]]
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 23s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:59 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248656{{!}}Set urwikisource to rtl (T415960)]] (duration: 06m 39s)
* 01:55 zabe@deploy2002: zabe: Continuing with sync
* 01:54 zabe@deploy2002: zabe: Backport for [[gerrit:1248656{{!}}Set urwikisource to rtl (T415960)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:53 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248656{{!}}Set urwikisource to rtl (T415960)]]
* 01:45 zabe@deploy2002: Sync cancelled.
* 01:43 zabe@deploy2002: zabe: Backport for [[gerrit:1248653{{!}}Activate urwikisource (T415960)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:42 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248653{{!}}Activate urwikisource (T415960)]]
* 01:38 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248652{{!}}Prepare urwikisource (T415960)]] (duration: 06m 18s)
* 01:34 zabe@deploy2002: zabe: Continuing with sync
* 01:34 zabe@deploy2002: zabe: Backport for [[gerrit:1248652{{!}}Prepare urwikisource (T415960)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:32 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248652{{!}}Prepare urwikisource (T415960)]]
* 01:29 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248650{{!}}Activate kaiwiki (T414234)]] (duration: 06m 57s)
* 01:25 zabe@deploy2002: zabe: Continuing with sync
* 01:24 zabe@deploy2002: zabe: Backport for [[gerrit:1248650{{!}}Activate kaiwiki (T414234)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:22 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248650{{!}}Activate kaiwiki (T414234)]]
* 01:17 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248647{{!}}Prepare kaiwiki (T414234)]] (duration: 07m 25s)
* 01:13 zabe@deploy2002: zabe: Continuing with sync
* 01:11 zabe@deploy2002: zabe: Backport for [[gerrit:1248647{{!}}Prepare kaiwiki (T414234)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:09 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248647{{!}}Prepare kaiwiki (T414234)]]
* 00:33 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248493{{!}}Stop writing to il_to on all wikis except commons (T415787)]] (duration: 06m 22s)
* 00:29 zabe@deploy2002: zabe: Continuing with sync
* 00:28 zabe@deploy2002: zabe: Backport for [[gerrit:1248493{{!}}Stop writing to il_to on all wikis except commons (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:27 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248493{{!}}Stop writing to il_to on all wikis except commons (T415787)]]
* 00:05 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248628{{!}}Re-enable AllowUserJs (T419137)]] (duration: 08m 08s)
* 00:01 catrope@deploy2002: catrope, kharlan: Continuing with sync
== 2026-03-05 ==
* 23:58 catrope@deploy2002: catrope, kharlan: Backport for [[gerrit:1248628{{!}}Re-enable AllowUserJs (T419137)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:56 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1248628{{!}}Re-enable AllowUserJs (T419137)]]
* 23:52 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248636{{!}}CSP: Update false positives list]] (duration: 06m 34s)
* 23:52 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint2003.wikimedia.org with OS trixie
* 23:47 catrope@deploy2002: catrope: Continuing with sync
* 23:47 catrope@deploy2002: catrope: Backport for [[gerrit:1248636{{!}}CSP: Update false positives list]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:45 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1248636{{!}}CSP: Update false positives list]]
* 23:33 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint2003.wikimedia.org with reason: host reimage
* 23:29 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on contint2003.wikimedia.org with reason: host reimage
* 23:15 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1238028{{!}}Using Hadoop for MostTranscludedPages on commonswiki (T416927)]] (duration: 06m 27s)
* 23:11 zabe@deploy2002: zabe: Continuing with sync
* 23:10 zabe@deploy2002: zabe: Backport for [[gerrit:1238028{{!}}Using Hadoop for MostTranscludedPages on commonswiki (T416927)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:09 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2003.wikimedia.org with OS trixie
* 23:08 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1238028{{!}}Using Hadoop for MostTranscludedPages on commonswiki (T416927)]]
* 22:45 maryum: Deployed security fix for [[phab:T418254|T418254]]
* 22:35 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248483{{!}}SpecialWantedFiles: Use lt_title instead of lt_to (T299953)]] (duration: 06m 12s)
* 22:31 zabe@deploy2002: zabe: Continuing with sync
* 22:30 zabe@deploy2002: zabe: Backport for [[gerrit:1248483{{!}}SpecialWantedFiles: Use lt_title instead of lt_to (T299953)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:28 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248483{{!}}SpecialWantedFiles: Use lt_title instead of lt_to (T299953)]]
* 21:43 ebernhardson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248508{{!}}cirrus: Align semanticsearch cluster group name with routing (T413969)]] (duration: 07m 20s)
* 21:39 ebernhardson@deploy2002: ebernhardson: Continuing with sync
* 21:38 ebernhardson@deploy2002: ebernhardson: Backport for [[gerrit:1248508{{!}}cirrus: Align semanticsearch cluster group name with routing (T413969)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:36 ebernhardson@deploy2002: Started scap sync-world: Backport for [[gerrit:1248508{{!}}cirrus: Align semanticsearch cluster group name with routing (T413969)]]
* 21:04 jhathaway@dns1004: END - running authdns-update
* 21:02 jhathaway@dns1004: START - running authdns-update
* 20:53 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:52 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new service IPs for sophroid - jasmine@cumin2002"
* 20:52 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new service IPs for sophroid - jasmine@cumin2002"
* 20:47 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 20:28 cdanis: apt built and imported jwt-authorizer 1.3.0-1
* 20:16 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.18 refs [[phab:T413809|T413809]]
* 20:04 krinkle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248574{{!}}Allow toolforge APIs in enforced CSP mode (T135963 T419137 T220475)]] (duration: 07m 37s)
* 20:00 krinkle@deploy2002: krinkle: Continuing with sync
* 19:58 krinkle@deploy2002: krinkle: Backport for [[gerrit:1248574{{!}}Allow toolforge APIs in enforced CSP mode (T135963 T419137 T220475)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:56 krinkle@deploy2002: Started scap sync-world: Backport for [[gerrit:1248574{{!}}Allow toolforge APIs in enforced CSP mode (T135963 T419137 T220475)]]
* 19:21 sbassett@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248571{{!}}Re-enable Site JS (T419137 T419138)]] (duration: 06m 57s)
* 19:17 sbassett@deploy2002: sbassett: Continuing with sync
* 19:16 sbassett@deploy2002: sbassett: Backport for [[gerrit:1248571{{!}}Re-enable Site JS (T419137 T419138)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:15 sbassett@deploy2002: Started scap sync-world: Backport for [[gerrit:1248571{{!}}Re-enable Site JS (T419137 T419138)]]
* 19:04 dr0ptp4kt: Deploying change {{Gerrit|1239200}} for refinery ( [[phab:T416481|T416481]] ) using scap, then deployed onto hdfs
* 19:03 dr0ptp4kt: Deployed refinery change {{Gerrit|1240253}} ( [[phab:T414478|T414478]] ), {{Gerrit|1240253}} (no-op) for refinery ( [[phab:T414478|T414478]] ) using scap, then deployed onto hdfs
* 18:58 dr0ptp4kt@deploy2002: Finished deploy [analytics/refinery@dd641b1] (thin): Regular analytics weekly train THIN [analytics/refinery@dd641b15] (duration: 02m 02s)
* 18:56 dr0ptp4kt@deploy2002: Started deploy [analytics/refinery@dd641b1] (thin): Regular analytics weekly train THIN [analytics/refinery@dd641b15]
* 18:55 dr0ptp4kt@deploy2002: Finished deploy [analytics/refinery@dd641b1]: Regular analytics weekly train [analytics/refinery@dd641b15] (duration: 04m 18s)
* 18:50 dr0ptp4kt@deploy2002: Started deploy [analytics/refinery@dd641b1]: Regular analytics weekly train [analytics/refinery@dd641b15]
* 18:49 dr0ptp4kt@deploy2002: Finished deploy [analytics/refinery@dd641b1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@dd641b15] (duration: 01m 57s)
* 18:47 dr0ptp4kt: Deploying change {{Gerrit|1239200}} for refinery ( [[phab:T416481|T416481]] )
* 18:47 dr0ptp4kt@deploy2002: Started deploy [analytics/refinery@dd641b1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@dd641b15]
* 18:31 eevans@dns1004: END - running authdns-update
* 18:30 eevans@dns1004: START - running authdns-update
* 18:30 sukhe: sudo cumin -b51 "A:cp" "run-puppet-agent --enable 'rolling out 1248544'"
* 18:16 sukhe: sudo cumin "A:cp" "disable-puppet 'rolling out 1248544'"
* 18:06 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:06 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
* 18:06 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
* 18:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 17:31 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248536{{!}}Enable wgUseSiteJs on donatewiki (T419138)]] (duration: 09m 57s)
* 17:27 mszwarc@deploy2002: mszwarc, krinkle: Continuing with sync
* 17:26 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint2003.wikimedia.org with OS bookworm
* 17:26 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:23 mszwarc@deploy2002: mszwarc, krinkle: Backport for [[gerrit:1248536{{!}}Enable wgUseSiteJs on donatewiki (T419138)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:21 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1248536{{!}}Enable wgUseSiteJs on donatewiki (T419138)]]
* 17:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1178.eqiad.wmnet
* 17:16 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:12 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1162.eqiad.wmnet
* 17:12 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1162.eqiad.wmnet
* 17:10 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wikikube-worker1162.eqiad.wmnet
* 17:10 cgoubert@cumin1003: START - Cookbook sre.hosts.remove-downtime for wikikube-worker1162.eqiad.wmnet
* 17:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 17:05 taavi@cumin1003: dbctl commit (dc=all): 'enable writes', diff saved to https://phabricator.wikimedia.org/P89812 and previous config saved to /var/cache/conftool/dbconfig/20260305-170556-taavi.json
* 16:03 oblivian@cumin1003: dbctl commit (dc=all): 'read only s6', diff saved to https://phabricator.wikimedia.org/P89810 and previous config saved to /var/cache/conftool/dbconfig/20260305-160348-oblivian.json
* 15:32 taavi@cumin1003: dbctl commit (dc=all): 'set global ro', diff saved to https://phabricator.wikimedia.org/P89808 and previous config saved to /var/cache/conftool/dbconfig/20260305-153203-taavi.json
* 15:31 mszwarc@deploy2002: mszwarc: Continuing with sync
* 15:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1178.eqiad.wmnet
* 15:31 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1248509{{!}}Disable custom JS for a moment]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:29 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1248509{{!}}Disable custom JS for a moment]]
* 15:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['contint2003']
* 15:25 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['contint2003']
* 15:23 ebernhardson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248506{{!}}cirrus: Correct semantic builder config (T413969)]] (duration: 07m 39s)
* 15:22 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:19 ebernhardson@deploy2002: ebernhardson: Continuing with sync
* 15:18 ebernhardson@deploy2002: ebernhardson: Backport for [[gerrit:1248506{{!}}cirrus: Correct semantic builder config (T413969)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:16 ebernhardson@deploy2002: Started scap sync-world: Backport for [[gerrit:1248506{{!}}cirrus: Correct semantic builder config (T413969)]]
* 15:11 elukey@cumin1003: END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 15:10 ebernhardson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244713{{!}}cirrus: Add semantic search test cluster (T413969)]] (duration: 09m 18s)
* 15:06 ebernhardson@deploy2002: ebernhardson: Continuing with sync
* 15:04 sukhe@dns1004: END - running authdns-update
* 15:03 sukhe@dns1004: START - running authdns-update
* 15:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:02 ebernhardson@deploy2002: ebernhardson: Backport for [[gerrit:1244713{{!}}cirrus: Add semantic search test cluster (T413969)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:02 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host contint2003
* 15:02 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host contint2003
* 15:00 ebernhardson@deploy2002: Started scap sync-world: Backport for [[gerrit:1244713{{!}}cirrus: Add semantic search test cluster (T413969)]]
* 14:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 14:53 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 14:50 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 14:38 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1010
* 14:38 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1010
* 14:32 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1010
* 14:32 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1010
* 14:32 sukhe@dns1004: END - running authdns-update
* 14:30 sukhe@dns1004: START - running authdns-update
* 14:28 elukey@cumin1003: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 14:28 sukhe@dns1004: START - running authdns-update
* 14:27 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1231.eqiad.wmnet
* 14:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1230.eqiad.wmnet
* 14:24 bking@dns1004: START - running authdns-update
* 14:15 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1230.eqiad.wmnet
* 14:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1229.eqiad.wmnet
* 14:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'ores-legacy' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 14:05 moritzm: imported nodejs 24.14.0-1nodesource1 to thirdparty/node24 [[phab:T418440|T418440]]
* 14:03 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1229.eqiad.wmnet
* 14:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1228.eqiad.wmnet
* 14:01 moritzm: initialised ganeti02/ulsfo cluster [[phab:T418993|T418993]]
* 13:57 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:56 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:56 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:53 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:52 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1228.eqiad.wmnet
* 13:52 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1227.eqiad.wmnet
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:46 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:43 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:42 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:42 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1199.eqiad.wmnet
* 13:40 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1227.eqiad.wmnet
* 13:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1226.eqiad.wmnet
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'ores-legacy' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:35 moritzm: installing glib2.0 security updates
* 13:35 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1199.eqiad.wmnet
* 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:26 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1226.eqiad.wmnet
* 13:26 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1225.eqiad.wmnet
* 13:26 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:15 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1225.eqiad.wmnet
* 13:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1224.eqiad.wmnet
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4006.ulsfo.wmnet with OS bookworm
* 13:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:07 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new VIP for routed ganeti in ulsfo - jmm@cumin2002"
* 13:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new VIP for routed ganeti in ulsfo - jmm@cumin2002"
* 13:06 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:05 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:02 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:01 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1224.eqiad.wmnet
* 13:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1223.eqiad.wmnet
* 13:00 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:58 cgoubert@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on wikikube-worker1162.eqiad.wmnet with reason: dcops intervention
* 12:57 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1162.eqiad.wmnet
* 12:56 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1162.eqiad.wmnet
* 12:55 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 12:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
* 12:49 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1223.eqiad.wmnet
* 12:49 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1222.eqiad.wmnet
* 12:46 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 12:43 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
* 12:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1222.eqiad.wmnet
* 12:37 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1221.eqiad.wmnet
* 12:23 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1221.eqiad.wmnet
* 12:23 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1220.eqiad.wmnet
* 12:23 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4006.ulsfo.wmnet with OS bookworm
* 12:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1220.eqiad.wmnet
* 11:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4005.ulsfo.wmnet
* 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4005.ulsfo.wmnet
* 11:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1236.eqiad.wmnet
* 11:29 moritzm: remove ganeti4006 from ganeti/ulsfo cluster [[phab:T418993|T418993]]
* 11:25 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1236.eqiad.wmnet
* 11:25 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1235.eqiad.wmnet
* 11:16 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1235.eqiad.wmnet
* 11:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1234.eqiad.wmnet
* 11:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1234.eqiad.wmnet
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1233.eqiad.wmnet
* 11:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1233.eqiad.wmnet
* 11:02 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1232.eqiad.wmnet
* 11:02 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 11:00 elukey@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4005.ulsfo.wmnet with OS bookworm
* 10:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 10:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1232.eqiad.wmnet
* 10:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1231.eqiad.wmnet
* 10:54 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 10:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1231.eqiad.wmnet
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1230.eqiad.wmnet
* 10:41 elukey@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4005.ulsfo.wmnet with reason: host reimage
* 10:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1230.eqiad.wmnet
* 10:37 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1229.eqiad.wmnet
* 10:32 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4005.ulsfo.wmnet with reason: host reimage
* 10:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1229.eqiad.wmnet
* 10:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1228.eqiad.wmnet
* 10:24 moritzm: installing Java 8 security updates
* 10:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1228.eqiad.wmnet
* 10:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1227.eqiad.wmnet
* 10:15 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1227.eqiad.wmnet
* 10:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1226.eqiad.wmnet
* 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
* 10:11 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4005.ulsfo.wmnet with OS bookworm
* 10:10 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ganeti4005.ulsfo.wmnet
* 10:08 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ganeti4005.ulsfo.wmnet
* 10:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add gw-virtual.ulsfo.wmnet - ayounsi@cumin1003"
* 10:07 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1226.eqiad.wmnet
* 10:07 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1225.eqiad.wmnet
* 09:59 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1225.eqiad.wmnet
* 09:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1224.eqiad.wmnet
* 09:52 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1224.eqiad.wmnet
* 09:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1223.eqiad.wmnet
* 09:44 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1223.eqiad.wmnet
* 09:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1222.eqiad.wmnet
* 09:43 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add gw-virtual.ulsfo.wmnet - ayounsi@cumin1003"
* 09:36 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1222.eqiad.wmnet
* 09:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1221.eqiad.wmnet
* 09:32 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:32 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:28 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1221.eqiad.wmnet
* 09:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1220.eqiad.wmnet
* 09:20 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1220.eqiad.wmnet
* 09:02 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
* 08:38 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247990{{!}}Drop 'centralnoticeadmin' from $wgOATHRequiredForGroups (T418580)]] (duration: 07m 07s)
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/airflow-main: apply
* 08:34 mszwarc@deploy2002: mszwarc: Continuing with sync
* 08:33 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1247990{{!}}Drop 'centralnoticeadmin' from $wgOATHRequiredForGroups (T418580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:30 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1247990{{!}}Drop 'centralnoticeadmin' from $wgOATHRequiredForGroups (T418580)]]
* 08:29 gehel@dns1004: END - running authdns-update
* 08:28 gehel@dns1004: START - running authdns-update
* 08:27 moritzm: installing mbedtls security updates
* 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 08:15 hashar@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248314{{!}}Revert "zhwiki: Add 2026 CNY celebration logos"]] (duration: 09m 19s)
* 08:11 hashar@deploy2002: hashar, stang: Continuing with sync
* 08:08 hashar@deploy2002: hashar, stang: Backport for [[gerrit:1248314{{!}}Revert "zhwiki: Add 2026 CNY celebration logos"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:06 hashar@deploy2002: Started scap sync-world: Backport for [[gerrit:1248314{{!}}Revert "zhwiki: Add 2026 CNY celebration logos"]]
* 08:02 moritzm: uploaded openjdk-8 8u482-ga-1~deb11u1 to component/jdk8 of bullseye-wikimedia
* 07:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts bast4005.wikimedia.org
* 07:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:56 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast4005.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast4005.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:48 moritzm: uploaded bird2 2.18-1~wmf13u2 to the main component of trixie-wikimedia [[phab:T413740|T413740]]
* 07:47 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:47 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 07:42 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts bast4005.wikimedia.org
* 06:35 marostegui@cumin1003: dbctl commit (dc=all): 'Remove es1033 [[phab:T408772|T408772]]', diff saved to https://phabricator.wikimedia.org/P89804 and previous config saved to /var/cache/conftool/dbconfig/20260305-063548-marostegui.json
* 02:10 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 55s)
* 02:02 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 02:01 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248163{{!}}Stop writing to il_to on medium size wikis (T415787)]] (duration: 06m 14s)
* 01:58 zabe@deploy2002: zabe: Continuing with sync
* 01:57 zabe@deploy2002: zabe: Backport for [[gerrit:1248163{{!}}Stop writing to il_to on medium size wikis (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:55 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248163{{!}}Stop writing to il_to on medium size wikis (T415787)]]
* 01:40 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1246099{{!}}Start reading from new file tables on medium wikis (T416548)]] (duration: 06m 15s)
* 01:36 zabe@deploy2002: zabe: Continuing with sync
* 01:36 zabe@deploy2002: zabe: Backport for [[gerrit:1246099{{!}}Start reading from new file tables on medium wikis (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:34 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1246099{{!}}Start reading from new file tables on medium wikis (T416548)]]
* 01:29 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248154{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]], [[gerrit:1248153{{!}}Revert^2 "ImageListPager: Properly support file schema migration read new"]] (duration: 07m 21s)
* 01:25 zabe@deploy2002: zabe: Continuing with sync
* 01:23 zabe@deploy2002: zabe: Backport for [[gerrit:1248154{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]], [[gerrit:1248153{{!}}Revert^2 "ImageListPager: Properly support file schema migration read new"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:21 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248154{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]], [[gerrit:1248153{{!}}Revert^2 "ImageListPager: Properly support file schema migration read new"]]
* 00:55 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248021{{!}}Stop writing to il_to on small wikis (T415787)]] (duration: 06m 49s)
* 00:51 zabe@deploy2002: zabe: Continuing with sync
* 00:50 zabe@deploy2002: zabe: Backport for [[gerrit:1248021{{!}}Stop writing to il_to on small wikis (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:48 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248021{{!}}Stop writing to il_to on small wikis (T415787)]]
* 00:19 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248125{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]], [[gerrit:1248123{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]] (duration: 08m 52s)
* 00:13 zabe@deploy2002: zabe: Continuing with sync
* 00:12 zabe@deploy2002: zabe: Backport for [[gerrit:1248125{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]], [[gerrit:1248123{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:10 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248125{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]], [[gerrit:1248123{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]]
== 2026-03-04 ==
* 22:57 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 22:56 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 22:55 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 22:55 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 22:55 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 22:54 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 22:35 tgr_: UTC late deploys done
* 22:33 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248084{{!}}Introduce a Semantic Search query route and builder (T413969)]], [[gerrit:1248085{{!}}Wire up semantic query building (T413969)]] (duration: 38m 28s)
* 22:16 tgr@deploy2002: tgr, ebernhardson: Continuing with sync
* 22:14 tgr@deploy2002: tgr, ebernhardson: Backport for [[gerrit:1248084{{!}}Introduce a Semantic Search query route and builder (T413969)]], [[gerrit:1248085{{!}}Wire up semantic query building (T413969)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248084{{!}}Introduce a Semantic Search query route and builder (T413969)]], [[gerrit:1248085{{!}}Wire up semantic query building (T413969)]]
* 21:48 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248012{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #3) (T415007 T418999)]] (duration: 07m 05s)
* 21:47 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on dse-k8s-worker1028.eqiad.wmnet with reason: broken networking
* 21:44 tgr@deploy2002: tgr: Continuing with sync
* 21:43 tgr@deploy2002: tgr: Backport for [[gerrit:1248012{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #3) (T415007 T418999)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:40 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248012{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #3) (T415007 T418999)]]
* 21:36 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248081{{!}}Add synthetic AAA experiment (T418614)]], [[gerrit:1248080{{!}}Add synthetic AAA experiment (T418614)]] (duration: 09m 11s)
* 21:35 bking@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1028.eqiad.wmnet
* 21:32 tgr@deploy2002: cjming, tgr: Continuing with sync
* 21:30 bking@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1028.eqiad.wmnet
* 21:29 tgr@deploy2002: cjming, tgr: Backport for [[gerrit:1248081{{!}}Add synthetic AAA experiment (T418614)]], [[gerrit:1248080{{!}}Add synthetic AAA experiment (T418614)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:27 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248081{{!}}Add synthetic AAA experiment (T418614)]], [[gerrit:1248080{{!}}Add synthetic AAA experiment (T418614)]]
* 21:21 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1245473{{!}}logging: set poolcounter channel log level to info (T418612)]] (duration: 09m 04s)
* 21:17 tgr@deploy2002: tgr, cwhite: Continuing with sync
* 21:14 tgr@deploy2002: tgr, cwhite: Backport for [[gerrit:1245473{{!}}logging: set poolcounter channel log level to info (T418612)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:12 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1245473{{!}}logging: set poolcounter channel log level to info (T418612)]]
* 21:07 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248007{{!}}Fix $wgJwtSessionCookieIssuer (T415007 T418999)]] (duration: 09m 55s)
* 21:03 tgr@deploy2002: tgr: Continuing with sync
* 20:59 tgr@deploy2002: tgr: Backport for [[gerrit:1248007{{!}}Fix $wgJwtSessionCookieIssuer (T415007 T418999)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:57 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248007{{!}}Fix $wgJwtSessionCookieIssuer (T415007 T418999)]]
* 19:56 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.18 refs [[phab:T413809|T413809]]
* 19:44 jhuneidi@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248011{{!}}CategoryViewer: Fall back to empty string in case of missing nextpage (T418934)]] (duration: 10m 47s)
* 19:44 cdobbins@cumin2002: conftool action : set/pooled=yes:weight=1; selector: name=cp205[0-8].codfw.wmnet
* 19:43 cdobbins@cumin2002: conftool action : set/pooled=yes:weight=1; selector: name=cp2049.codfw.wmnet
* 19:40 jhuneidi@deploy2002: zabe, jhuneidi: Continuing with sync
* 19:35 jhuneidi@deploy2002: zabe, jhuneidi: Backport for [[gerrit:1248011{{!}}CategoryViewer: Fall back to empty string in case of missing nextpage (T418934)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:34 brett@puppetserver1001: conftool action : set/weight=1; selector: name=cp2043.*
* 19:34 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2043.*
* 19:33 jhuneidi@deploy2002: Started scap sync-world: Backport for [[gerrit:1248011{{!}}CategoryViewer: Fall back to empty string in case of missing nextpage (T418934)]]
* 19:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2043.codfw.wmnet with OS trixie
* 19:23 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 19:22 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 19:22 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 19:22 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 19:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2043.codfw.wmnet with reason: host reimage
* 19:06 brett@puppetserver1001: conftool action : set/weight=1; selector: name=cp204[45678].*
* 19:04 brett@puppetserver1001: conftool action : set/weight=100; selector: name=cp204[45678].*
* 19:02 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2043.codfw.wmnet with reason: host reimage
* 18:58 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp204[45678].*
* 18:52 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 18:51 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 18:50 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 18:50 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 18:49 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:49 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:49 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 18:48 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 18:48 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 18:47 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 18:47 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 18:47 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp2043.codfw.wmnet with OS trixie
* 18:46 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 18:43 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 18:42 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 18:41 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 18:41 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 18:40 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:40 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:40 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 18:40 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 18:39 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 18:39 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 18:38 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 18:37 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 18:37 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:37 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:32 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:32 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:16 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 18:16 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 18:16 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 18:15 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 18:15 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:14 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:14 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 18:13 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 18:13 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 18:13 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 18:12 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 18:12 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2047.codfw.wmnet with OS trixie
* 17:33 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
* 17:27 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
* 17:23 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:23 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:18 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:18 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:15 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:13 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:09 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp2047.codfw.wmnet with OS trixie
* 16:55 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:55 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:54 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:54 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1007.eqiad.wmnet with OS bookworm
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-unlock-scap (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:39 root@deploy2002: Unlocked for deployment [ALL REPOSITORIES]: Datacenter switchover from eqiad to codfw - [[phab:T418133|T418133]] (duration: 25m 37s)
* 16:39 root@deploy2002: Forcefully removing global lock: Datacenter switchover from eqiad to codfw - [[phab:T418133|T418133]]
* 16:39 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-unlock-scap for datacenter switchover from eqiad to codfw
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
* 16:27 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
* 16:27 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters for datacenter switchover from eqiad to codfw
* 16:27 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-restore-ttl (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:26 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-restore-ttl for datacenter switchover from eqiad to codfw
* 16:26 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:26 root@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 16:26 root@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 16:26 root@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 16:26 root@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 16:26 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.08-start-maintenance for datacenter switchover from eqiad to codfw
* 16:25 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:25 root@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: sync
* 16:25 root@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: sync
* 16:25 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: [DRY-RUN] MediaWiki read-only period ends at: 2026-03-04 16:24:40.502004
* 16:24 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.07-set-readwrite for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1007.eqiad.wmnet with reason: host reimage
* 16:24 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki for datacenter switchover from eqiad to codfw
* 16:23 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:23 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.03-set-db-readonly for datacenter switchover from eqiad to codfw
* 16:23 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.02-set-readonly (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:22 blake@cumin1003: [DRY-RUN] MediaWiki read-only period starts at: 2026-03-04 16:22:41.755892
* 16:22 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.02-set-readonly for datacenter switchover from eqiad to codfw
* 16:20 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1007.eqiad.wmnet with reason: host reimage
* 16:20 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:20 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance for datacenter switchover from eqiad to codfw
* 16:19 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:14 moritzm: upgrading cloudservices* to Bird 2.18 [[phab:T413740|T413740]]
* 16:14 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl for datacenter switchover from eqiad to codfw
* 16:13 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-lock-scap (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:13 root@deploy2002: Locking from deployment [ALL REPOSITORIES]: Datacenter switchover from eqiad to codfw - [[phab:T418133|T418133]]
* 16:13 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-lock-scap for datacenter switchover from eqiad to codfw
* 16:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
* 16:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
* 16:10 moritzm: remove ganeti4005 from ganeti/ulsfo cluster [[phab:T418993|T418993]]
* 16:10 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1007.eqiad.wmnet with OS bookworm
* 16:06 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:06 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks for datacenter switchover from eqiad to codfw
* 15:59 XioNoX: push pfw policies - [[phab:T418402|T418402]]
* 15:37 sukhe@dns1004: END - running authdns-update
* 15:36 sukhe@dns1004: START - running authdns-update
* 15:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1219.eqiad.wmnet
* 15:32 aqu@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 15:31 aqu@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
* 15:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4005.ulsfo.wmnet
* 15:29 cgoubert@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on P<nowiki>{</nowiki>ms-fe10[14-24].*<nowiki>}</nowiki> and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 15:24 cgoubert@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on P<nowiki>{</nowiki>ms-fe10[14-24].*<nowiki>}</nowiki> and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 15:22 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 15:22 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 15:22 cgoubert@cumin1003: END (ERROR) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=97) rolling restart_daemons on A:swift-fe-eqiad
* 15:21 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1219.eqiad.wmnet
* 15:21 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1218.eqiad.wmnet
* 15:19 cgoubert@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-eqiad
* 15:16 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: name=cirrussearch1120.eqiad.wmnet
* 15:16 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: name=cirrussearch1121.eqiad.wmnet
* 15:16 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1115.eqiad.wmnet [reason: [[phab:T418772|T418772]] - BGP maintenance]
* 15:16 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:16 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: name=cirrussearch1122.eqiad.wmnet
* 15:15 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:15 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:14 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:13 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:13 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:10 XioNoX: lsw1-d7-eqiad# tools network-instance default protocols bgp neighbor 10.64.128.17 reset-peer - [[phab:T418772|T418772]]
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
* 15:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1218.eqiad.wmnet
* 15:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1217.eqiad.wmnet
* 15:09 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:05 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:05 moritzm: upgrading cloudlb* to Bird 2.18 [[phab:T413740|T413740]]
* 15:05 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:04 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:58 Dreamy_Jazz: Afternoon UTC backport window done
* 14:58 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244373{{!}}zhwiki: Remove all rights from accountcreator (T418089)]] (duration: 08m 12s)
* 14:57 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:57 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1217.eqiad.wmnet
* 14:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1216.eqiad.wmnet
* 14:57 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:56 btullis@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6 days, 0:00:00 on dse-k8s-worker[1010-1011,1013,1018-1019].eqiad.wmnet with reason: Adding 10 Gbps NIC
* 14:54 dreamyjazz@deploy2002: dreamyjazz, 1f616emo: Continuing with sync
* 14:52 jmm@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
* 14:52 dreamyjazz@deploy2002: dreamyjazz, 1f616emo: Backport for [[gerrit:1244373{{!}}zhwiki: Remove all rights from accountcreator (T418089)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:50 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1244373{{!}}zhwiki: Remove all rights from accountcreator (T418089)]]
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1216.eqiad.wmnet
* 14:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1215.eqiad.wmnet
* 14:44 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248009{{!}}Hooks: Fix liquidthreads log type definition bugs (T417425 T419006)]], [[gerrit:1248008{{!}}Define $wgWikimediaMessagesHasLiquidThreadsLogs (T417425)]] (duration: 07m 11s)
* 14:44 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1115.eqiad.wmnet [reason: [[phab:T418772|T418772]] - BGP maintenance]
* 14:44 taavi: updating CR firewall policy with https://gerrit.wikimedia.org/r/c/operations/homer/public/+/970275
* 14:42 ayounsi@cumin1003: conftool action : set/pooled=no; selector: name=cirrussearch1122.eqiad.wmnet
* 14:42 ayounsi@cumin1003: conftool action : set/pooled=no; selector: name=cirrussearch1121.eqiad.wmnet
* 14:42 ayounsi@cumin1003: conftool action : set/pooled=no; selector: name=cirrussearch1120.eqiad.wmnet
* 14:40 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 14:39 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1248009{{!}}Hooks: Fix liquidthreads log type definition bugs (T417425 T419006)]], [[gerrit:1248008{{!}}Define $wgWikimediaMessagesHasLiquidThreadsLogs (T417425)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:37 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1248009{{!}}Hooks: Fix liquidthreads log type definition bugs (T417425 T419006)]], [[gerrit:1248008{{!}}Define $wgWikimediaMessagesHasLiquidThreadsLogs (T417425)]]
* 14:33 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1215.eqiad.wmnet
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1214.eqiad.wmnet
* 14:32 btullis@puppetserver1001: conftool action : get/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1028.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1028.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1025.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1024.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/weight=1; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1028.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/weight=1; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1025.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/weight=1; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1024.eqiad.wmnet
* 14:30 btullis@puppetserver1001: conftool action : get/pooled; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1024.eqiad.wmnet
* 14:29 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_esams and A:cp - 3.0 upgrade ()
* 14:27 arnaudb@dns1004: END - running authdns-update
* 14:26 arnaudb@dns1004: START - running authdns-update
* 14:26 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248000{{!}}Revert "Enable JWT session cookie for bot passwords (all wikis) (attempt #2)" (T415007 T418999)]] (duration: 07m 19s)
* 14:22 tgr@deploy2002: tgr: Continuing with sync
* 14:21 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1214.eqiad.wmnet
* 14:21 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1213.eqiad.wmnet
* 14:21 tgr@deploy2002: tgr: Backport for [[gerrit:1248000{{!}}Revert "Enable JWT session cookie for bot passwords (all wikis) (attempt #2)" (T415007 T418999)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:19 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248000{{!}}Revert "Enable JWT session cookie for bot passwords (all wikis) (attempt #2)" (T415007 T418999)]]
* 14:14 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247566{{!}}Enable new HTML confirmation emails for all (T416748)]] (duration: 07m 46s)
* 14:13 elukey@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 14:13 elukey@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 14:10 sgimeno@deploy2002: migr, sgimeno: Continuing with sync
* 14:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1213.eqiad.wmnet
* 14:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1212.eqiad.wmnet
* 14:09 sgimeno@deploy2002: migr, sgimeno: Backport for [[gerrit:1247566{{!}}Enable new HTML confirmation emails for all (T416748)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 elukey@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 14:08 elukey@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 14:08 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:07 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:07 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1247566{{!}}Enable new HTML confirmation emails for all (T416748)]]
* 13:59 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4005.ulsfo.wmnet
* 13:58 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4005.ulsfo.wmnet
* 13:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4005.ulsfo.wmnet
* 13:57 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1212.eqiad.wmnet
* 13:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1211.eqiad.wmnet
* 13:49 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_esams and A:cp - 3.0 upgrade ()
* 13:45 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1211.eqiad.wmnet
* 13:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1210.eqiad.wmnet
* 13:43 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_esams and A:cp - 3.0 upgrade ()
* 13:40 arnaudb@dns1004: END - running authdns-update
* 13:39 arnaudb@dns1004: START - running authdns-update
* 13:37 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
* 13:33 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1210.eqiad.wmnet
* 13:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1209.eqiad.wmnet
* 13:20 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1209.eqiad.wmnet
* 13:20 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1208.eqiad.wmnet
* 13:17 aokoth@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:17 aokoth@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:16 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:15 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1208.eqiad.wmnet
* 13:06 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1207.eqiad.wmnet
* 13:03 arnaudb@dns1005: END - running authdns-update
* 13:02 arnaudb@dns1005: START - running authdns-update
* 13:00 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_esams and A:cp - 3.0 upgrade ()
* 13:00 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
* 12:46 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 12:45 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 12:44 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 12:44 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
* 12:43 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 12:43 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
* 12:33 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 12:29 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
* 12:10 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
* 12:08 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp - 3.0 upgrade ()
* 12:03 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1207.eqiad.wmnet
* 12:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1206.eqiad.wmnet
* 11:53 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1206.eqiad.wmnet
* 11:53 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1205.eqiad.wmnet
* 11:36 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f8-eqiad
* 11:36 jmm@cumin2002: START - Cookbook sre.network.tls for network device lsw1-f8-eqiad
* 11:34 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp - 3.0 upgrade ()
* 11:34 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
* 11:28 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247968{{!}}SI: Update instrumentation schema (T418293)]] (duration: 16m 22s)
* 11:22 fabfur: start upgrading haproxy to 3.0 on A:cp-eqiad ([[phab:T417253|T417253]])
* 11:22 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 11:17 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1247968{{!}}SI: Update instrumentation schema (T418293)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:13 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp - 3.0 upgrade ()
* 11:12 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1247968{{!}}SI: Update instrumentation schema (T418293)]]
* 11:08 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp - 3.0 upgrade ()
* 11:07 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=1) rolling reimage on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 11:07 blake@cumin1003: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 11:06 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2356].codfw.wmnet
* 11:06 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2356].codfw.wmnet
* 11:03 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=1) rolling reimage on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 11:03 blake@cumin1003: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 10:55 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2356].codfw.wmnet
* 10:55 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2356].codfw.wmnet
* 10:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1205.eqiad.wmnet
* 10:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1204.eqiad.wmnet
* 10:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 10:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 10:42 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1204.eqiad.wmnet
* 10:42 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1203.eqiad.wmnet
* 10:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 10:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 10:28 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1203.eqiad.wmnet
* 10:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1202.eqiad.wmnet
* 10:25 fabfur: start upgrading haproxy to 3.0 on A:cp-drmrs ([[phab:T417253|T417253]])
* 10:25 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp - 3.0 upgrade ()
* 10:25 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp - 3.0 upgrade ()
* 10:24 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247941{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]], [[gerrit:1247944{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]] (duration: 06m 42s)
* 10:22 arnaudb@dns1004: END - running authdns-update
* 10:20 arnaudb@dns1004: START - running authdns-update
* 10:20 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 10:20 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1247941{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]], [[gerrit:1247944{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:18 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1247941{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]], [[gerrit:1247944{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]]
* 10:16 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1202.eqiad.wmnet
* 10:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1201.eqiad.wmnet
* 10:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 10:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 10:06 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 10:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 10:04 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1201.eqiad.wmnet
* 10:04 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1200.eqiad.wmnet
* 09:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 09:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1200.eqiad.wmnet
* 09:39 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247925{{!}}Require 2FA from CentralNotice admins and WMF Trust & Safety (T418580 T417880)]] (duration: 08m 23s)
* 09:36 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_codfw and A:cp - 3.0 upgrade ()
* 09:35 mszwarc@deploy2002: mszwarc: Continuing with sync
* 09:33 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1247925{{!}}Require 2FA from CentralNotice admins and WMF Trust & Safety (T418580 T417880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pki-root1002.eqiad.wmnet
* 09:31 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp - 3.0 upgrade ()
* 09:31 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1247925{{!}}Require 2FA from CentralNotice admins and WMF Trust & Safety (T418580 T417880)]]
* 09:30 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host pki-root1002.eqiad.wmnet
* 09:20 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:20 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:20 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:20 jmm@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:03 gehel: switching off Blazegraph on wdqs2009 (legacy full graph endpoint is end of life) - [[phab:T411410|T411410]] / [[phab:T415073|T415073]]
* 09:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 09:02 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1096.eqiad.wmnet
* 09:00 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet
* 08:59 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 08:58 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 08:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 08:56 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1096.eqiad.wmnet
* 08:54 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet
* 08:52 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 08:49 topranks: disabling IBGP session between ssw1-d1-eqiad and ssw1-d8-eqiad to remove backup paths (try #2) [[phab:T411054|T411054]]
* 08:36 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on backup1007.eqiad.wmnet,dbprov1004.eqiad.wmnet with reason: network maintenance
* 08:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 08:31 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 08:21 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp - 3.0 upgrade ()
* 08:21 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_codfw and A:cp - 3.0 upgrade ()
* 08:11 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp5032.*
* 07:54 topranks: disabling IBGP session between ssw1-d1-eqiad and ssw1-d8-eqiad to remove backup paths [[phab:T411054|T411054]]
* 07:43 moritzm: installing libbpf updates from Bookworm point release
* 05:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 05:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 04s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 01:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89793 and previous config saved to /var/cache/conftool/dbconfig/20260304-015657-marostegui.json
* 01:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P89792 and previous config saved to /var/cache/conftool/dbconfig/20260304-014150-marostegui.json
* 01:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P89791 and previous config saved to /var/cache/conftool/dbconfig/20260304-012642-marostegui.json
* 01:23 zabe@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 01:22 zabe@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 01:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89790 and previous config saved to /var/cache/conftool/dbconfig/20260304-011134-marostegui.json
* 00:46 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1263 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89789 and previous config saved to /var/cache/conftool/dbconfig/20260304-004638-marostegui.json
* 00:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1263.eqiad.wmnet with reason: Maintenance
* 00:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89788 and previous config saved to /var/cache/conftool/dbconfig/20260304-004615-marostegui.json
* 00:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P89787 and previous config saved to /var/cache/conftool/dbconfig/20260304-003107-marostegui.json
* 00:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P89786 and previous config saved to /var/cache/conftool/dbconfig/20260304-001559-marostegui.json
* 00:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89785 and previous config saved to /var/cache/conftool/dbconfig/20260304-000052-marostegui.json
== 2026-03-03 ==
* 23:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1262 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89784 and previous config saved to /var/cache/conftool/dbconfig/20260303-233500-marostegui.json
* 23:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1262.eqiad.wmnet with reason: Maintenance
* 23:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89783 and previous config saved to /var/cache/conftool/dbconfig/20260303-233436-marostegui.json
* 23:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P89782 and previous config saved to /var/cache/conftool/dbconfig/20260303-231929-marostegui.json
* 23:10 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 23:08 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 23:08 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:07 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:05 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 23:05 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 23:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P89781 and previous config saved to /var/cache/conftool/dbconfig/20260303-230421-marostegui.json
* 23:04 bking@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1028.eqiad.wmnet
* 23:02 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247689{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247690{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247596{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #2) (T415007)]] (duration: 21m 47s)
* 23:00 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7008.magru.wmnet [reason: lldpd packet drop issues]
* 22:58 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7008 [reason: lldpd packet drop issues]
* 22:58 tgr@deploy2002: tgr: Continuing with sync
* 22:56 bking@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1028.eqiad.wmnet
* 22:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89780 and previous config saved to /var/cache/conftool/dbconfig/20260303-224913-marostegui.json
* 22:45 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 22:45 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 22:44 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 22:44 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 22:42 tgr@deploy2002: tgr: Backport for [[gerrit:1247689{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247690{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247596{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #2) (T415007)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:40 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1247689{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247690{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247596{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #2) (T415007)]]
* 22:26 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 22:26 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 22:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1261 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89779 and previous config saved to /var/cache/conftool/dbconfig/20260303-222324-marostegui.json
* 22:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1261.eqiad.wmnet with reason: Maintenance
* 22:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89778 and previous config saved to /var/cache/conftool/dbconfig/20260303-222301-marostegui.json
* 22:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P89777 and previous config saved to /var/cache/conftool/dbconfig/20260303-220754-marostegui.json
* 21:59 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1245162 [[phab:T411807|T411807]] (duration: 12m 15s)
* 21:58 rzl@deploy2002: rzl: Continuing with sync
* 21:56 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 21:56 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 21:55 rzl@deploy2002: rzl: https://gerrit.wikimedia.org/r/1245162 [[phab:T411807|T411807]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1245162 [[phab:T411807|T411807]]
* 21:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P89776 and previous config saved to /var/cache/conftool/dbconfig/20260303-215247-marostegui.json
* 21:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89775 and previous config saved to /var/cache/conftool/dbconfig/20260303-214931-marostegui.json
* 21:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp2045.codfw.wmnet
* 21:48 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp2045.codfw.wmnet
* 21:40 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 21:39 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 21:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89774 and previous config saved to /var/cache/conftool/dbconfig/20260303-213739-marostegui.json
* 21:35 jhuneidi@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244748{{!}}REST: show the beta Attribution API in the REST Sandbox (T418522)]] (duration: 07m 41s)
* 21:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P89773 and previous config saved to /var/cache/conftool/dbconfig/20260303-213423-marostegui.json
* 21:32 jhuneidi@deploy2002: jhuneidi, bpirkle: Continuing with sync
* 21:30 jhuneidi@deploy2002: jhuneidi, bpirkle: Backport for [[gerrit:1244748{{!}}REST: show the beta Attribution API in the REST Sandbox (T418522)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:28 jhuneidi@deploy2002: Started scap sync-world: Backport for [[gerrit:1244748{{!}}REST: show the beta Attribution API in the REST Sandbox (T418522)]]
* 21:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P89772 and previous config saved to /var/cache/conftool/dbconfig/20260303-211915-marostegui.json
* 21:18 jhuneidi@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247652{{!}}Remove redundant mw-extra wgRestSandboxSpecs entry]] (duration: 06m 56s)
* 21:14 jhuneidi@deploy2002: jhuneidi, aaron: Continuing with sync
* 21:13 jhuneidi@deploy2002: jhuneidi, aaron: Backport for [[gerrit:1247652{{!}}Remove redundant mw-extra wgRestSandboxSpecs entry]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:11 jhuneidi@deploy2002: Started scap sync-world: Backport for [[gerrit:1247652{{!}}Remove redundant mw-extra wgRestSandboxSpecs entry]]
* 21:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1260 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89771 and previous config saved to /var/cache/conftool/dbconfig/20260303-211033-marostegui.json
* 21:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1260.eqiad.wmnet with reason: Maintenance
* 21:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89770 and previous config saved to /var/cache/conftool/dbconfig/20260303-211009-marostegui.json
* 21:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89769 and previous config saved to /var/cache/conftool/dbconfig/20260303-210407-marostegui.json
* 20:58 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp2045.codfw.wmnet with reason: troubleshooting for [[phab:T418527|T418527]]
* 20:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P89768 and previous config saved to /var/cache/conftool/dbconfig/20260303-205502-marostegui.json
* 20:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7008.magru.wmnet with OS trixie
* 20:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89767 and previous config saved to /var/cache/conftool/dbconfig/20260303-204452-marostegui.json
* 20:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2248.codfw.wmnet with reason: Maintenance
* 20:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89766 and previous config saved to /var/cache/conftool/dbconfig/20260303-204439-marostegui.json
* 20:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P89765 and previous config saved to /var/cache/conftool/dbconfig/20260303-203954-marostegui.json
* 20:34 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:34 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P89764 and previous config saved to /var/cache/conftool/dbconfig/20260303-202931-marostegui.json
* 20:24 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7008.magru.wmnet with reason: host reimage
* 20:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89763 and previous config saved to /var/cache/conftool/dbconfig/20260303-202447-marostegui.json
* 20:17 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7008.magru.wmnet with reason: host reimage
* 20:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P89762 and previous config saved to /var/cache/conftool/dbconfig/20260303-201423-marostegui.json
* 20:10 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1199.eqiad.wmnet
* 19:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89761 and previous config saved to /var/cache/conftool/dbconfig/20260303-195916-marostegui.json
* 19:59 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1252 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89760 and previous config saved to /var/cache/conftool/dbconfig/20260303-195900-marostegui.json
* 19:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1252.eqiad.wmnet with reason: Maintenance
* 19:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89759 and previous config saved to /var/cache/conftool/dbconfig/20260303-195835-marostegui.json
* 19:51 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7008.magru.wmnet with OS trixie
* 19:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P89758 and previous config saved to /var/cache/conftool/dbconfig/20260303-194327-marostegui.json
* 19:42 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp2043.codfw.wmnet
* 19:42 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp2043.codfw.wmnet
* 19:33 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89757 and previous config saved to /var/cache/conftool/dbconfig/20260303-193351-marostegui.json
* 19:33 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 19:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89756 and previous config saved to /var/cache/conftool/dbconfig/20260303-193338-marostegui.json
* 19:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P89755 and previous config saved to /var/cache/conftool/dbconfig/20260303-192820-marostegui.json
* 19:19 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.18 refs [[phab:T413809|T413809]]
* 19:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P89754 and previous config saved to /var/cache/conftool/dbconfig/20260303-191830-marostegui.json
* 19:17 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2047.codfw.wmnet with OS trixie
* 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89753 and previous config saved to /var/cache/conftool/dbconfig/20260303-191312-marostegui.json
* 19:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P89752 and previous config saved to /var/cache/conftool/dbconfig/20260303-190323-marostegui.json
* 18:53 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
* 18:49 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1199.eqiad.wmnet
* 18:49 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1198.eqiad.wmnet
* 18:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1249 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89751 and previous config saved to /var/cache/conftool/dbconfig/20260303-184937-marostegui.json
* 18:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1249.eqiad.wmnet with reason: Maintenance
* 18:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89750 and previous config saved to /var/cache/conftool/dbconfig/20260303-184913-marostegui.json
* 18:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89749 and previous config saved to /var/cache/conftool/dbconfig/20260303-184815-marostegui.json
* 18:47 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
* 18:45 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1096.eqiad.wmnet with OS bullseye
* 18:36 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1198.eqiad.wmnet
* 18:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1197.eqiad.wmnet
* 18:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P89747 and previous config saved to /var/cache/conftool/dbconfig/20260303-183406-marostegui.json
* 18:29 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp2047.codfw.wmnet with OS trixie
* 18:24 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1197.eqiad.wmnet
* 18:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1196.eqiad.wmnet
* 18:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89746 and previous config saved to /var/cache/conftool/dbconfig/20260303-182346-marostegui.json
* 18:23 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1096.eqiad.wmnet with reason: host reimage
* 18:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 18:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89745 and previous config saved to /var/cache/conftool/dbconfig/20260303-182321-marostegui.json
* 18:19 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1096.eqiad.wmnet with reason: host reimage
* 18:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P89744 and previous config saved to /var/cache/conftool/dbconfig/20260303-181859-marostegui.json
* 18:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1196.eqiad.wmnet
* 18:11 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1195.eqiad.wmnet
* 18:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P89743 and previous config saved to /var/cache/conftool/dbconfig/20260303-180814-marostegui.json
* 18:04 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247635{{!}}Style fixes for copy-paste feature (T414072)]] (duration: 32m 54s)
* 18:04 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89742 and previous config saved to /var/cache/conftool/dbconfig/20260303-180352-marostegui.json
* 18:02 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1096.eqiad.wmnet with OS bullseye
* 18:02 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:59 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1195.eqiad.wmnet
* 17:59 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host an-worker1194.eqiad.wmnet
* 17:55 ariel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:53 ariel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P89741 and previous config saved to /var/cache/conftool/dbconfig/20260303-175304-marostegui.json
* 17:52 jforrester@deploy2002: jforrester: Continuing with sync
* 17:51 jforrester@deploy2002: jforrester: Backport for [[gerrit:1247635{{!}}Style fixes for copy-paste feature (T414072)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:47 ariel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:46 ariel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1194.eqiad.wmnet
* 17:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1193.eqiad.wmnet
* 17:39 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89740 and previous config saved to /var/cache/conftool/dbconfig/20260303-173914-marostegui.json
* 17:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1248.eqiad.wmnet with reason: Maintenance
* 17:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89739 and previous config saved to /var/cache/conftool/dbconfig/20260303-173850-marostegui.json
* 17:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89738 and previous config saved to /var/cache/conftool/dbconfig/20260303-173756-marostegui.json
* 17:31 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1247635{{!}}Style fixes for copy-paste feature (T414072)]]
* 17:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1193.eqiad.wmnet
* 17:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1192.eqiad.wmnet
* 17:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P89736 and previous config saved to /var/cache/conftool/dbconfig/20260303-172343-marostegui.json
* 17:18 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1192.eqiad.wmnet
* 17:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1191.eqiad.wmnet
* 17:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89735 and previous config saved to /var/cache/conftool/dbconfig/20260303-171149-marostegui.json
* 17:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 17:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89734 and previous config saved to /var/cache/conftool/dbconfig/20260303-171126-marostegui.json
* 17:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P89733 and previous config saved to /var/cache/conftool/dbconfig/20260303-170835-marostegui.json
* 17:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1191.eqiad.wmnet
* 17:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1190.eqiad.wmnet
* 16:56 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1190.eqiad.wmnet
* 16:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P89732 and previous config saved to /var/cache/conftool/dbconfig/20260303-165618-marostegui.json
* 16:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89731 and previous config saved to /var/cache/conftool/dbconfig/20260303-165327-marostegui.json
* 16:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1189.eqiad.wmnet
* 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P89730 and previous config saved to /var/cache/conftool/dbconfig/20260303-164111-marostegui.json
* 16:34 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main'.
* 16:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1189.eqiad.wmnet
* 16:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1188.eqiad.wmnet
* 16:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89729 and previous config saved to /var/cache/conftool/dbconfig/20260303-162845-marostegui.json
* 16:28 fceratto@cumin1003: dbctl commit (dc=all): 'Setting x1 codfw weights to 300 [[phab:T416705|T416705]]', diff saved to https://phabricator.wikimedia.org/P89728 and previous config saved to /var/cache/conftool/dbconfig/20260303-162836-fceratto.json
* 16:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1247.eqiad.wmnet with reason: Maintenance
* 16:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89727 and previous config saved to /var/cache/conftool/dbconfig/20260303-162603-marostegui.json
* 16:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 16:18 fceratto@cumin1003: dbctl commit (dc=all): 'Setting db1188 weight to 100 [[phab:T416705|T416705]]', diff saved to https://phabricator.wikimedia.org/P89726 and previous config saved to /var/cache/conftool/dbconfig/20260303-161846-fceratto.json
* 16:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 16:17 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1188.eqiad.wmnet
* 16:17 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1187.eqiad.wmnet
* 16:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1166: testing:crash
* 16:14 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1166: testing:crash
* 16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Setting db1169 weight to 300 [[phab:T416705|T416705]]', diff saved to https://phabricator.wikimedia.org/P89724 and previous config saved to /var/cache/conftool/dbconfig/20260303-161323-fceratto.json
* 16:12 fceratto@cumin1003: dbctl commit (dc=all): 'Setting db1188 weight to 300 [[phab:T416705|T416705]]', diff saved to https://phabricator.wikimedia.org/P89723 and previous config saved to /var/cache/conftool/dbconfig/20260303-161230-fceratto.json
* 16:07 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 16:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89722 and previous config saved to /var/cache/conftool/dbconfig/20260303-160720-marostegui.json
* 16:07 brennen@deploy2002: Finished deploy [phabricator/deployment@a883b6d]: deploy phab1004 for [[phab:T418872|T418872]] (duration: 01m 07s)
* 16:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1187.eqiad.wmnet
* 16:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1186.eqiad.wmnet
* 16:05 brennen@deploy2002: Started deploy [phabricator/deployment@a883b6d]: deploy phab1004 for [[phab:T418872|T418872]]
* 16:05 brennen@deploy2002: Finished deploy [phabricator/deployment@a883b6d]: deploy phab2002 for [[phab:T418872|T418872]] (duration: 00m 32s)
* 16:04 brennen@deploy2002: Started deploy [phabricator/deployment@a883b6d]: deploy phab2002 for [[phab:T418872|T418872]]
* 16:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2240 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89721 and previous config saved to /var/cache/conftool/dbconfig/20260303-160207-marostegui.json
* 16:02 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
* 16:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2240.codfw.wmnet with reason: Maintenance
* 16:01 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
* 16:00 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247569{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]] (duration: 09m 28s)
* 15:54 zabe@deploy2002: zabe: Continuing with sync
* 15:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1186.eqiad.wmnet
* 15:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1185.eqiad.wmnet
* 15:54 zabe@deploy2002: zabe: Backport for [[gerrit:1247569{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:53 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade ()
* 15:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P89720 and previous config saved to /var/cache/conftool/dbconfig/20260303-155212-marostegui.json
* 15:50 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1247569{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]]
* 15:49 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade ()
* 15:45 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:42 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1185.eqiad.wmnet
* 15:42 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1184.eqiad.wmnet
* 15:42 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:41 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 15:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 15:41 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 15:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89719 and previous config saved to /var/cache/conftool/dbconfig/20260303-154104-marostegui.json
* 15:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P89718 and previous config saved to /var/cache/conftool/dbconfig/20260303-153704-marostegui.json
* 15:36 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1352-1359].eqiad.wmnet
* 15:36 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1352-1359].eqiad.wmnet
* 15:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1184.eqiad.wmnet
* 15:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1183.eqiad.wmnet
* 15:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P89717 and previous config saved to /var/cache/conftool/dbconfig/20260303-152557-marostegui.json
* 15:23 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1178.eqiad.wmnet
* 15:22 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on P<nowiki>{</nowiki>cp5032.*<nowiki>}</nowiki> and A:cp - 3.0 upgrade ()
* 15:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89716 and previous config saved to /var/cache/conftool/dbconfig/20260303-152157-marostegui.json
* 15:19 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1183.eqiad.wmnet
* 15:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1182.eqiad.wmnet
* 15:16 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P<nowiki>{</nowiki>cp5032.*<nowiki>}</nowiki> and A:cp - 3.0 upgrade ()
* 15:15 fabfur@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=1) rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
* 15:14 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade ()
* 15:14 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
* 15:13 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade ()
* 15:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 15:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P89715 and previous config saved to /var/cache/conftool/dbconfig/20260303-151049-marostegui.json
* 15:08 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:07 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1182.eqiad.wmnet
* 15:07 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1181.eqiad.wmnet
* 14:57 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1244 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89714 and previous config saved to /var/cache/conftool/dbconfig/20260303-145727-marostegui.json
* 14:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1244.eqiad.wmnet with reason: Maintenance
* 14:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89713 and previous config saved to /var/cache/conftool/dbconfig/20260303-145704-marostegui.json
* 14:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89712 and previous config saved to /var/cache/conftool/dbconfig/20260303-145541-marostegui.json
* 14:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1181.eqiad.wmnet
* 14:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1180.eqiad.wmnet
* 14:49 moritzm: installing php7.4 security updates
* 14:46 jayme@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[1352-1359].eqiad.wmnet
* 14:46 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1352-1359].eqiad.wmnet
* 14:43 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1180.eqiad.wmnet
* 14:43 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1179.eqiad.wmnet
* 14:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P89711 and previous config saved to /var/cache/conftool/dbconfig/20260303-144156-marostegui.json
* 14:38 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 14:38 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1240716{{!}}Remove Editing-related config for special wikis (T400063)]] (duration: 06m 34s)
* 14:36 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:34 esanders@deploy2002: esanders: Continuing with sync
* 14:34 esanders@deploy2002: esanders: Backport for [[gerrit:1240716{{!}}Remove Editing-related config for special wikis (T400063)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:34 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:32 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1240716{{!}}Remove Editing-related config for special wikis (T400063)]]
* 14:31 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1179.eqiad.wmnet
* 14:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1178.eqiad.wmnet
* 14:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89710 and previous config saved to /var/cache/conftool/dbconfig/20260303-143141-marostegui.json
* 14:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 14:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89709 and previous config saved to /var/cache/conftool/dbconfig/20260303-143117-marostegui.json
* 14:29 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247578{{!}}PasteCheck: Enable by default (T405127)]] (duration: 08m 01s)
* 14:27 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
* 14:27 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet
* 14:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P89708 and previous config saved to /var/cache/conftool/dbconfig/20260303-142649-marostegui.json
* 14:26 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
* 14:25 esanders@deploy2002: esanders: Continuing with sync
* 14:23 esanders@deploy2002: esanders: Backport for [[gerrit:1247578{{!}}PasteCheck: Enable by default (T405127)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:21 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1247578{{!}}PasteCheck: Enable by default (T405127)]]
* 14:20 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet
* 14:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P89707 and previous config saved to /var/cache/conftool/dbconfig/20260303-141610-marostegui.json
* 14:15 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247576{{!}}Enable Wikibase GraphQL on test.wikidata.org (T417619)]], [[gerrit:1247577{{!}}Enable Wikibase GraphQL on production wikidata.org (T417619)]] (duration: 08m 17s)
* 14:11 esanders@deploy2002: esanders, jakob: Continuing with sync
* 14:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89706 and previous config saved to /var/cache/conftool/dbconfig/20260303-141142-marostegui.json
* 14:09 esanders@deploy2002: esanders, jakob: Backport for [[gerrit:1247576{{!}}Enable Wikibase GraphQL on test.wikidata.org (T417619)]], [[gerrit:1247577{{!}}Enable Wikibase GraphQL on production wikidata.org (T417619)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1247576{{!}}Enable Wikibase GraphQL on test.wikidata.org (T417619)]], [[gerrit:1247577{{!}}Enable Wikibase GraphQL on production wikidata.org (T417619)]]
* 14:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P89704 and previous config saved to /var/cache/conftool/dbconfig/20260303-140102-marostegui.json
* 13:47 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1243 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89703 and previous config saved to /var/cache/conftool/dbconfig/20260303-134702-marostegui.json
* 13:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1243.eqiad.wmnet with reason: Maintenance
* 13:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89702 and previous config saved to /var/cache/conftool/dbconfig/20260303-134639-marostegui.json
* 13:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89701 and previous config saved to /var/cache/conftool/dbconfig/20260303-134554-marostegui.json
* 13:31 moritzm: installing NSS security updates
* 13:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P89700 and previous config saved to /var/cache/conftool/dbconfig/20260303-133131-marostegui.json
* 13:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89699 and previous config saved to /var/cache/conftool/dbconfig/20260303-132414-marostegui.json
* 13:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 13:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89698 and previous config saved to /var/cache/conftool/dbconfig/20260303-132350-marostegui.json
* 13:20 tappof: Thanos: re-enable querier<->ruler cross-site traffic [[phab:T412924|T412924]]
* 13:17 dpogorzelski@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=recommendation-api,name=eqiad
* 13:17 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/ml-serve-eqiad: maintenance
* 13:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P89697 and previous config saved to /var/cache/conftool/dbconfig/20260303-131624-marostegui.json
* 13:16 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/ml-serve-eqiad: maintenance
* 13:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 13:11 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1177.eqiad.wmnet
* 13:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1359.eqiad.wmnet with OS trixie
* 13:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P89696 and previous config saved to /var/cache/conftool/dbconfig/20260303-130842-marostegui.json
* 13:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89695 and previous config saved to /var/cache/conftool/dbconfig/20260303-130117-marostegui.json
* 13:01 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:00 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1177.eqiad.wmnet
* 13:00 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 13:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1176.eqiad.wmnet
* 12:59 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1358.eqiad.wmnet with OS trixie
* 12:56 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:55 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:53 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1359.eqiad.wmnet with reason: host reimage
* 12:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P89694 and previous config saved to /var/cache/conftool/dbconfig/20260303-125335-marostegui.json
* 12:52 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1357.eqiad.wmnet with OS trixie
* 12:51 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:50 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:48 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1359.eqiad.wmnet with reason: host reimage
* 12:48 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1356.eqiad.wmnet with OS trixie
* 12:47 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:47 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 12:47 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:47 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 12:46 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 12:46 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 12:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1176.eqiad.wmnet
* 12:46 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1175.eqiad.wmnet
* 12:46 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 12:45 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:45 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 12:45 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 12:45 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:45 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'readability' for release 'main' .
* 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'ores-legacy' for release 'main' .
* 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 12:43 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1358.eqiad.wmnet with reason: host reimage
* 12:42 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 12:42 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 12:41 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:40 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247559{{!}}Enable thumb steps on private wikis too (T414805)]] (duration: 13m 01s)
* 12:39 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89693 and previous config saved to /var/cache/conftool/dbconfig/20260303-123827-marostegui.json
* 12:36 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1357.eqiad.wmnet with reason: host reimage
* 12:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1242 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89692 and previous config saved to /var/cache/conftool/dbconfig/20260303-123642-marostegui.json
* 12:36 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1359.eqiad.wmnet with OS trixie
* 12:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1242.eqiad.wmnet with reason: Maintenance
* 12:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89691 and previous config saved to /var/cache/conftool/dbconfig/20260303-123619-marostegui.json
* 12:34 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1175.eqiad.wmnet
* 12:34 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1174.eqiad.wmnet
* 12:34 dpogorzelski@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=recommendation-api,name=eqiad
* 12:33 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 12:33 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1356.eqiad.wmnet with reason: host reimage
* 12:31 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:31 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in eqiad/ml-serve-eqiad: maintenance
* 12:31 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:31 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1247559{{!}}Enable thumb steps on private wikis too (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:30 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in eqiad/ml-serve-eqiad: maintenance
* 12:28 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1358.eqiad.wmnet with reason: host reimage
* 12:28 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1357.eqiad.wmnet with reason: host reimage
* 12:28 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1356.eqiad.wmnet with reason: host reimage
* 12:27 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:27 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1247559{{!}}Enable thumb steps on private wikis too (T414805)]]
* 12:26 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1174.eqiad.wmnet
* 12:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1173.eqiad.wmnet
* 12:21 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P89690 and previous config saved to /var/cache/conftool/dbconfig/20260303-122112-marostegui.json
* 12:20 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:20 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:19 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1353.eqiad.wmnet with OS trixie
* 12:16 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1358.eqiad.wmnet with OS trixie
* 12:16 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1357.eqiad.wmnet with OS trixie
* 12:15 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1356.eqiad.wmnet with OS trixie
* 12:14 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1355.eqiad.wmnet with OS trixie
* 12:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89689 and previous config saved to /var/cache/conftool/dbconfig/20260303-121420-marostegui.json
* 12:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2219.codfw.wmnet with reason: Maintenance
* 12:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89688 and previous config saved to /var/cache/conftool/dbconfig/20260303-121355-marostegui.json
* 12:09 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1354.eqiad.wmnet with OS trixie
* 12:08 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1173.eqiad.wmnet
* 12:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1172.eqiad.wmnet
* 12:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P89687 and previous config saved to /var/cache/conftool/dbconfig/20260303-120604-marostegui.json
* 12:04 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1352.eqiad.wmnet with OS trixie
* 12:02 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1353.eqiad.wmnet with reason: host reimage
* 11:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P89686 and previous config saved to /var/cache/conftool/dbconfig/20260303-115847-marostegui.json
* 11:58 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1355.eqiad.wmnet with reason: host reimage
* 11:52 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1354.eqiad.wmnet with reason: host reimage
* 11:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89685 and previous config saved to /var/cache/conftool/dbconfig/20260303-115057-marostegui.json
* 11:48 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1352.eqiad.wmnet with reason: host reimage
* 11:44 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1355.eqiad.wmnet with reason: host reimage
* 11:43 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1354.eqiad.wmnet with reason: host reimage
* 11:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P89684 and previous config saved to /var/cache/conftool/dbconfig/20260303-114341-marostegui.json
* 11:43 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1353.eqiad.wmnet with reason: host reimage
* 11:42 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1352.eqiad.wmnet with reason: host reimage
* 11:40 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
* 11:36 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
* 11:31 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1355.eqiad.wmnet with OS trixie
* 11:31 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1354.eqiad.wmnet with OS trixie
* 11:30 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1353.eqiad.wmnet with OS trixie
* 11:30 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1352.eqiad.wmnet with OS trixie
* 11:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T418465|T418465]])', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260303-112828-marostegui.json
* 11:25 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1241 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89683 and previous config saved to /var/cache/conftool/dbconfig/20260303-112535-marostegui.json
* 11:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1241.eqiad.wmnet with reason: Maintenance
* 11:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89682 and previous config saved to /var/cache/conftool/dbconfig/20260303-112511-marostegui.json
* 11:21 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 11:18 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 11:18 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:17 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:17 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 11:16 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1350-1351].eqiad.wmnet
* 11:16 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1350-1351].eqiad.wmnet
* 11:15 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 11:15 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:15 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:15 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 11:14 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 11:14 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 11:13 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 11:13 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 11:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1172.eqiad.wmnet
* 11:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1171.eqiad.wmnet
* 11:13 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 11:13 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 11:12 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 11:11 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 11:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P89681 and previous config saved to /var/cache/conftool/dbconfig/20260303-111003-marostegui.json
* 11:09 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 11:08 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 11:08 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 11:07 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 11:07 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 11:06 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 11:05 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89680 and previous config saved to /var/cache/conftool/dbconfig/20260303-110551-marostegui.json
* 11:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2210.codfw.wmnet with reason: Maintenance
* 11:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89679 and previous config saved to /var/cache/conftool/dbconfig/20260303-110527-marostegui.json
* 10:59 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1171.eqiad.wmnet
* 10:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1170.eqiad.wmnet
* 10:57 slyngshede@dns1004: END - running authdns-update
* 10:55 slyngshede@dns1004: START - running authdns-update
* 10:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P89678 and previous config saved to /var/cache/conftool/dbconfig/20260303-105455-marostegui.json
* 10:54 hashar@deploy2002: Finished deploy [gerrit/gerrit@12177b1]: wm-checks-api: add tag for Selenium jobs (duration: 00m 13s)
* 10:54 hashar@deploy2002: Started deploy [gerrit/gerrit@12177b1]: wm-checks-api: add tag for Selenium jobs
* 10:51 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
* 10:51 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
* 10:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P89677 and previous config saved to /var/cache/conftool/dbconfig/20260303-105020-marostegui.json
* 10:47 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1170.eqiad.wmnet
* 10:45 fabfur: start upgrading haproxy to 3.0 on A:cp-eqsin ([[phab:T417253|T417253]])
* 10:43 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 10:41 moritzm: installing Django security updates
* 10:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89676 and previous config saved to /var/cache/conftool/dbconfig/20260303-103947-marostegui.json
* 10:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P89675 and previous config saved to /var/cache/conftool/dbconfig/20260303-103512-marostegui.json
* 10:34 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:33 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:33 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:31 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 10:25 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 10:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89674 and previous config saved to /var/cache/conftool/dbconfig/20260303-102004-marostegui.json
* 10:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1238 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89673 and previous config saved to /var/cache/conftool/dbconfig/20260303-101800-marostegui.json
* 10:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1238.eqiad.wmnet with reason: Maintenance
* 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89672 and previous config saved to /var/cache/conftool/dbconfig/20260303-101747-marostegui.json
* 09:57 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 09:56 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89670 and previous config saved to /var/cache/conftool/dbconfig/20260303-095655-marostegui.json
* 09:56 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2206.codfw.wmnet with reason: Maintenance
* 09:53 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:51 moritzm: installing qemu security updates
* 09:48 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 09:48 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 09:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P89669 and previous config saved to /var/cache/conftool/dbconfig/20260303-094732-marostegui.json
* 09:47 elukey@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 09:47 elukey@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 09:45 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 09:45 elukey@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 09:44 elukey@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 09:44 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'ores-legacy' for release 'main' .
* 09:44 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 09:44 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1176.eqiad.wmnet with reason: host reimage
* 09:43 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 09:40 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 09:38 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1176.eqiad.wmnet with reason: host reimage
* 09:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2199.codfw.wmnet with reason: Maintenance
* 09:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89668 and previous config saved to /var/cache/conftool/dbconfig/20260303-093542-marostegui.json
* 09:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89667 and previous config saved to /var/cache/conftool/dbconfig/20260303-093224-marostegui.json
* 09:32 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 09:23 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade ()
* 09:23 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db1176.eqiad.wmnet with OS trixie
* 09:21 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 09:20 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 09:20 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade ()
* 09:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P89666 and previous config saved to /var/cache/conftool/dbconfig/20260303-092034-marostegui.json
* 09:19 arnaudb@dns1004: END - running authdns-update
* 09:18 arnaudb@dns1004: START - running authdns-update
* 09:17 moritzm: installing libbpf updates from Bookworm point release
* 09:08 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89665 and previous config saved to /var/cache/conftool/dbconfig/20260303-090818-marostegui.json
* 09:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on 6 hosts with reason: Maintenance
* 09:07 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1221.eqiad.wmnet with reason: Maintenance
* 09:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89664 and previous config saved to /var/cache/conftool/dbconfig/20260303-090731-marostegui.json
* 09:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P89663 and previous config saved to /var/cache/conftool/dbconfig/20260303-090526-marostegui.json
* 08:54 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-serve-codfw: maintenance
* 08:53 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-serve-codfw: maintenance
* 08:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P89662 and previous config saved to /var/cache/conftool/dbconfig/20260303-085224-marostegui.json
* 08:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89661 and previous config saved to /var/cache/conftool/dbconfig/20260303-085019-marostegui.json
* 08:47 moritzm: powercycling lvs1013
* 08:41 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade ()
* 08:41 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade ()
* 08:37 fabfur: start upgrading haproxy to 3.0 on A:cp-ulsfo ([[phab:T417253|T417253]])
* 08:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P89660 and previous config saved to /var/cache/conftool/dbconfig/20260303-083716-marostegui.json
* 08:32 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 08:32 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:31 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 08:30 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 08:28 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-serve-codfw: maintenance
* 08:27 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-serve-codfw: maintenance
* 08:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89659 and previous config saved to /var/cache/conftool/dbconfig/20260303-082424-marostegui.json
* 08:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 08:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89658 and previous config saved to /var/cache/conftool/dbconfig/20260303-082400-marostegui.json
* 08:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89657 and previous config saved to /var/cache/conftool/dbconfig/20260303-082209-marostegui.json
* 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P89656 and previous config saved to /var/cache/conftool/dbconfig/20260303-080853-marostegui.json
* 08:07 moritzm: installing PAM security updates on Bookworm
* 07:55 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1199 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89655 and previous config saved to /var/cache/conftool/dbconfig/20260303-075526-marostegui.json
* 07:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1199.eqiad.wmnet with reason: Maintenance
* 07:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89654 and previous config saved to /var/cache/conftool/dbconfig/20260303-075502-marostegui.json
* 07:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P89653 and previous config saved to /var/cache/conftool/dbconfig/20260303-075345-marostegui.json
* 07:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P89652 and previous config saved to /var/cache/conftool/dbconfig/20260303-073955-marostegui.json
* 07:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89651 and previous config saved to /var/cache/conftool/dbconfig/20260303-073838-marostegui.json
* 07:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P89650 and previous config saved to /var/cache/conftool/dbconfig/20260303-072447-marostegui.json
* 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
* 07:18 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
* 07:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89649 and previous config saved to /var/cache/conftool/dbconfig/20260303-071054-marostegui.json
* 07:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 07:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89648 and previous config saved to /var/cache/conftool/dbconfig/20260303-071029-marostegui.json
* 07:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89647 and previous config saved to /var/cache/conftool/dbconfig/20260303-070940-marostegui.json
* 06:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P89646 and previous config saved to /var/cache/conftool/dbconfig/20260303-065523-marostegui.json
* 06:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1190 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89645 and previous config saved to /var/cache/conftool/dbconfig/20260303-064405-marostegui.json
* 06:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1190.eqiad.wmnet with reason: Maintenance
* 06:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P89644 and previous config saved to /var/cache/conftool/dbconfig/20260303-064015-marostegui.json
* 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2240 gradually with 4 steps - repool after schema change
* 06:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89642 and previous config saved to /var/cache/conftool/dbconfig/20260303-062507-marostegui.json
* 05:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2147 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89639 and previous config saved to /var/cache/conftool/dbconfig/20260303-055834-marostegui.json
* 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2147.codfw.wmnet with reason: Maintenance
* 05:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 05:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2212.codfw.wmnet with reason: Maintenance
* 05:48 marostegui@cumin1003: START - Cookbook sre.mysql.pool db2240 gradually with 4 steps - repool after schema change
* 05:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 05:01 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.15 (duration: 01m 10s)
* 04:43 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.18 refs [[phab:T413809|T413809]] (duration: 39m 43s)
* 04:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.18 refs [[phab:T413809|T413809]]
* 03:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 03:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89637 and previous config saved to /var/cache/conftool/dbconfig/20260303-035746-marostegui.json
* 03:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P89636 and previous config saved to /var/cache/conftool/dbconfig/20260303-034239-marostegui.json
* 03:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P89635 and previous config saved to /var/cache/conftool/dbconfig/20260303-032731-marostegui.json
* 03:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89634 and previous config saved to /var/cache/conftool/dbconfig/20260303-031224-marostegui.json
* 03:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1251 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89633 and previous config saved to /var/cache/conftool/dbconfig/20260303-030217-marostegui.json
* 03:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1251.eqiad.wmnet with reason: Maintenance
* 02:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1240.eqiad.wmnet with reason: Maintenance
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 00s)
* 02:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 02:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89632 and previous config saved to /var/cache/conftool/dbconfig/20260303-020817-marostegui.json
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P89631 and previous config saved to /var/cache/conftool/dbconfig/20260303-015309-marostegui.json
* 01:42 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mwlog2003.codfw.wmnet with OS trixie
* 01:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P89630 and previous config saved to /var/cache/conftool/dbconfig/20260303-013802-marostegui.json
* 01:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89629 and previous config saved to /var/cache/conftool/dbconfig/20260303-013719-marostegui.json
* 01:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89628 and previous config saved to /var/cache/conftool/dbconfig/20260303-012254-marostegui.json
* 01:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P89627 and previous config saved to /var/cache/conftool/dbconfig/20260303-012211-marostegui.json
* 01:19 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mwlog2003.codfw.wmnet with reason: host reimage
* 01:11 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mwlog2003.codfw.wmnet with reason: host reimage
* 01:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1235 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89626 and previous config saved to /var/cache/conftool/dbconfig/20260303-011151-marostegui.json
* 01:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1235.eqiad.wmnet with reason: Maintenance
* 01:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89625 and previous config saved to /var/cache/conftool/dbconfig/20260303-011128-marostegui.json
* 01:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P89624 and previous config saved to /var/cache/conftool/dbconfig/20260303-010703-marostegui.json
* 00:59 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247189{{!}}Revert "ImageListPager: Properly support file schema migration read new"]] (duration: 08m 12s)
* 00:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P89623 and previous config saved to /var/cache/conftool/dbconfig/20260303-005620-marostegui.json
* 00:56 zabe@deploy2002: zabe: Continuing with sync
* 00:54 herron@cumin1003: START - Cookbook sre.hosts.reimage for host mwlog2003.codfw.wmnet with OS trixie
* 00:53 zabe@deploy2002: zabe: Backport for [[gerrit:1247189{{!}}Revert "ImageListPager: Properly support file schema migration read new"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mwlog2003.codfw.wmnet with OS trixie
* 00:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89622 and previous config saved to /var/cache/conftool/dbconfig/20260303-005156-marostegui.json
* 00:51 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1247189{{!}}Revert "ImageListPager: Properly support file schema migration read new"]]
* 00:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P89621 and previous config saved to /var/cache/conftool/dbconfig/20260303-004112-marostegui.json
* 00:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2216 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89620 and previous config saved to /var/cache/conftool/dbconfig/20260303-004056-marostegui.json
* 00:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2216.codfw.wmnet with reason: Maintenance
* 00:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89619 and previous config saved to /var/cache/conftool/dbconfig/20260303-004033-marostegui.json
* 00:31 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mwlog1003.eqiad.wmnet with OS trixie
* 00:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89618 and previous config saved to /var/cache/conftool/dbconfig/20260303-002604-marostegui.json
* 00:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to https://phabricator.wikimedia.org/P89617 and previous config saved to /var/cache/conftool/dbconfig/20260303-002525-marostegui.json
* 00:20 zabe@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 00:18 zabe@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 00:18 zabe@deploy2002: Finished scap sync-world: [[phab:T418327|T418327]] (duration: 05m 01s)
* 00:15 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1234 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89616 and previous config saved to /var/cache/conftool/dbconfig/20260303-001504-marostegui.json
* 00:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1234.eqiad.wmnet with reason: Maintenance
* 00:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89615 and previous config saved to /var/cache/conftool/dbconfig/20260303-001440-marostegui.json
* 00:13 zabe@deploy2002: Started scap sync-world: [[phab:T418327|T418327]]
* 00:11 zabe@deploy2002: zabe: Continuing with sync
* 00:10 zabe@deploy2002: zabe: Backport for [[gerrit:1247068{{!}}ImageListPager: Properly support file schema migration read new (T418327)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to https://phabricator.wikimedia.org/P89614 and previous config saved to /var/cache/conftool/dbconfig/20260303-001018-marostegui.json
* 00:08 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1247068{{!}}ImageListPager: Properly support file schema migration read new (T418327)]]
== 2026-03-02 ==
* 23:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P89613 and previous config saved to /var/cache/conftool/dbconfig/20260302-235933-marostegui.json
* 23:58 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1240320{{!}}Stop writing to il_to on testwiki (T415787)]] (duration: 06m 02s)
* 23:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89612 and previous config saved to /var/cache/conftool/dbconfig/20260302-235511-marostegui.json
* 23:54 zabe@deploy2002: zabe: Continuing with sync
* 23:53 zabe@deploy2002: zabe: Backport for [[gerrit:1240320{{!}}Stop writing to il_to on testwiki (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:52 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1240320{{!}}Stop writing to il_to on testwiki (T415787)]]
* 23:51 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp2058.codfw.wmnet with reason: dcops troubleshooting for [[phab:T418527|T418527]]
* 23:50 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1246880{{!}}multiversion: Stop setting MW_USE_CONFIG_SCHEMA (T304460)]] (duration: 07m 10s)
* 23:47 zabe@deploy2002: zabe: Continuing with sync
* 23:45 zabe@deploy2002: zabe: Backport for [[gerrit:1246880{{!}}multiversion: Stop setting MW_USE_CONFIG_SCHEMA (T304460)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P89611 and previous config saved to /var/cache/conftool/dbconfig/20260302-234425-marostegui.json
* 23:44 herron@cumin1003: START - Cookbook sre.hosts.reimage for host mwlog2003.codfw.wmnet with OS trixie
* 23:43 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89610 and previous config saved to /var/cache/conftool/dbconfig/20260302-234350-marostegui.json
* 23:43 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1246880{{!}}multiversion: Stop setting MW_USE_CONFIG_SCHEMA (T304460)]]
* 23:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2203.codfw.wmnet with reason: Maintenance
* 23:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2202.codfw.wmnet with reason: Maintenance
* 23:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89609 and previous config saved to /var/cache/conftool/dbconfig/20260302-233517-marostegui.json
* 23:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89608 and previous config saved to /var/cache/conftool/dbconfig/20260302-232918-marostegui.json
* 23:25 dwisehaupt@dns1006: END - running authdns-update
* 23:24 dwisehaupt@dns1006: START - running authdns-update
* 23:23 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mwlog1003.eqiad.wmnet with reason: host reimage
* 23:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P89607 and previous config saved to /var/cache/conftool/dbconfig/20260302-232009-marostegui.json
* 23:18 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mwlog1003.eqiad.wmnet with reason: host reimage
* 23:17 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1232 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89606 and previous config saved to /var/cache/conftool/dbconfig/20260302-231723-marostegui.json
* 23:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1232.eqiad.wmnet with reason: Maintenance
* 23:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89605 and previous config saved to /var/cache/conftool/dbconfig/20260302-231658-marostegui.json
* 23:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P89604 and previous config saved to /var/cache/conftool/dbconfig/20260302-230502-marostegui.json
* 23:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P89603 and previous config saved to /var/cache/conftool/dbconfig/20260302-230151-marostegui.json
* 22:57 herron@cumin1003: START - Cookbook sre.hosts.reimage for host mwlog1003.eqiad.wmnet with OS trixie
* 22:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89602 and previous config saved to /var/cache/conftool/dbconfig/20260302-224954-marostegui.json
* 22:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P89601 and previous config saved to /var/cache/conftool/dbconfig/20260302-224643-marostegui.json
* 22:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2188 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89600 and previous config saved to /var/cache/conftool/dbconfig/20260302-223612-marostegui.json
* 22:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2188.codfw.wmnet with reason: Maintenance
* 22:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89599 and previous config saved to /var/cache/conftool/dbconfig/20260302-223548-marostegui.json
* 22:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89598 and previous config saved to /var/cache/conftool/dbconfig/20260302-223135-marostegui.json
* 22:21 maryum: Deployed security fix for [[phab:T418179|T418179]]
* 22:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P89597 and previous config saved to /var/cache/conftool/dbconfig/20260302-222041-marostegui.json
* 22:19 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89596 and previous config saved to /var/cache/conftool/dbconfig/20260302-221938-marostegui.json
* 22:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1219.eqiad.wmnet with reason: Maintenance
* 22:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89595 and previous config saved to /var/cache/conftool/dbconfig/20260302-221925-marostegui.json
* 22:10 aaron@deploy2002: Finished scap sync-world: Backport for [[gerrit:1242613{{!}}Add growthexperiments.v0 to $wgRestSandboxSpecs (T414470)]] (duration: 06m 39s)
* 22:06 aaron@deploy2002: aaron: Continuing with sync
* 22:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P89594 and previous config saved to /var/cache/conftool/dbconfig/20260302-220533-marostegui.json
* 22:05 aaron@deploy2002: aaron: Backport for [[gerrit:1242613{{!}}Add growthexperiments.v0 to $wgRestSandboxSpecs (T414470)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P89593 and previous config saved to /var/cache/conftool/dbconfig/20260302-220418-marostegui.json
* 22:03 aaron@deploy2002: Started scap sync-world: Backport for [[gerrit:1242613{{!}}Add growthexperiments.v0 to $wgRestSandboxSpecs (T414470)]]
* 22:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-backup2003.codfw.wmnet with OS trixie
* 22:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-backup2004.codfw.wmnet with OS trixie
* 22:03 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 22:03 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 22:02 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 22:01 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247149{{!}}ApiCSPReport: Use structured logging for CSP reports]] (duration: 08m 19s)
* 21:57 catrope@deploy2002: catrope: Continuing with sync
* 21:56 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 21:55 catrope@deploy2002: catrope: Backport for [[gerrit:1247149{{!}}ApiCSPReport: Use structured logging for CSP reports]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:53 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1247149{{!}}ApiCSPReport: Use structured logging for CSP reports]]
* 21:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89592 and previous config saved to /var/cache/conftool/dbconfig/20260302-215025-marostegui.json
* 21:50 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp2043.codfw.wmnet with reason: These are test instances, failing should not notif
* 21:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P89591 and previous config saved to /var/cache/conftool/dbconfig/20260302-214910-marostegui.json
* 21:48 inflatador: bking@desktop restarting wdqs codfw to clear ProbeDown alerts
* 21:43 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cp2043.codfw.wmnet
* 21:42 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-backup2004.codfw.wmnet with reason: host reimage
* 21:39 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2176 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89590 and previous config saved to /var/cache/conftool/dbconfig/20260302-213957-marostegui.json
* 21:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 21:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89589 and previous config saved to /var/cache/conftool/dbconfig/20260302-213934-marostegui.json
* 21:38 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-backup2003.codfw.wmnet with reason: host reimage
* 21:36 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:cassandra-dev: Testing removal of OpenJDK 8 support - eevans@cumin1003
* 21:34 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1226024{{!}}Add Comments namespace for shnwikinews (T414403)]] (duration: 07m 07s)
* 21:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89588 and previous config saved to /var/cache/conftool/dbconfig/20260302-213402-marostegui.json
* 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-backup2004.codfw.wmnet with reason: host reimage
* 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-backup2003.codfw.wmnet with reason: host reimage
* 21:30 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp2043.codfw.wmnet
* 21:30 catrope@deploy2002: shivaanshsingh, catrope: Continuing with sync
* 21:29 catrope@deploy2002: shivaanshsingh, catrope: Backport for [[gerrit:1226024{{!}}Add Comments namespace for shnwikinews (T414403)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:27 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1226024{{!}}Add Comments namespace for shnwikinews (T414403)]]
* 21:24 kemayo@deploy2002: Finished scap sync-world: Backport for [[gerrit:1243990{{!}}Suggestion Mode: add values for suggestion feedback properties (T401739)]], [[gerrit:1240721{{!}}Stop PasteCheck A/B test (T417429)]] (duration: 10m 55s)
* 21:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P89587 and previous config saved to /var/cache/conftool/dbconfig/20260302-212426-marostegui.json
* 21:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89586 and previous config saved to /var/cache/conftool/dbconfig/20260302-212345-marostegui.json
* 21:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1218.eqiad.wmnet with reason: Maintenance
* 21:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89585 and previous config saved to /var/cache/conftool/dbconfig/20260302-212321-marostegui.json
* 21:20 kemayo@deploy2002: esanders, kemayo, caro: Continuing with sync
* 21:17 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host ms-backup2004.codfw.wmnet with OS trixie
* 21:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host ms-backup2003.codfw.wmnet with OS trixie
* 21:16 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:cassandra-dev: Testing removal of OpenJDK 8 support - eevans@cumin1003
* 21:16 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ms-backup2003']
* 21:15 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-backup2003']
* 21:15 kemayo@deploy2002: esanders, kemayo, caro: Backport for [[gerrit:1243990{{!}}Suggestion Mode: add values for suggestion feedback properties (T401739)]], [[gerrit:1240721{{!}}Stop PasteCheck A/B test (T417429)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:14 inflatador: bking@apt1002 reprepro --component thirdparty/opensearch3 update trixie-wikimedia [[phab:T418388|T418388]]
* 21:13 kemayo@deploy2002: Started scap sync-world: Backport for [[gerrit:1243990{{!}}Suggestion Mode: add values for suggestion feedback properties (T401739)]], [[gerrit:1240721{{!}}Stop PasteCheck A/B test (T417429)]]
* 21:12 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-backup2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:11 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-backup2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:10 dani@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247107{{!}}Undeploy Comparative Reader Research survey on eswiki (T417834)]], [[gerrit:1247105{{!}}Undeploy Comparative Reader Research survey on enwiki (T417829)]] (duration: 06m 52s)
* 21:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P89584 and previous config saved to /var/cache/conftool/dbconfig/20260302-210919-marostegui.json
* 21:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P89583 and previous config saved to /var/cache/conftool/dbconfig/20260302-210813-marostegui.json
* 21:06 dani@deploy2002: dani: Continuing with sync
* 21:05 dani@deploy2002: dani: Backport for [[gerrit:1247107{{!}}Undeploy Comparative Reader Research survey on eswiki (T417834)]], [[gerrit:1247105{{!}}Undeploy Comparative Reader Research survey on enwiki (T417829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:03 dani@deploy2002: Started scap sync-world: Backport for [[gerrit:1247107{{!}}Undeploy Comparative Reader Research survey on eswiki (T417834)]], [[gerrit:1247105{{!}}Undeploy Comparative Reader Research survey on enwiki (T417829)]]
* 20:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-backup2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-backup2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-backup2004
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-backup2004
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-backup2003
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-backup2003
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-backup2003 to codfw - jhancock@cumin2002"
* 20:54 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-backup2003 to codfw - jhancock@cumin2002"
* 20:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89582 and previous config saved to /var/cache/conftool/dbconfig/20260302-205411-marostegui.json
* 20:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P89581 and previous config saved to /var/cache/conftool/dbconfig/20260302-205307-marostegui.json
* 20:50 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 20:42 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:41 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89580 and previous config saved to /var/cache/conftool/dbconfig/20260302-204136-marostegui.json
* 20:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 20:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89579 and previous config saved to /var/cache/conftool/dbconfig/20260302-204112-marostegui.json
* 20:41 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:39 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89578 and previous config saved to /var/cache/conftool/dbconfig/20260302-203759-marostegui.json
* 20:37 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:31 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89577 and previous config saved to /var/cache/conftool/dbconfig/20260302-202740-marostegui.json
* 20:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1206.eqiad.wmnet with reason: Maintenance
* 20:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89576 and previous config saved to /var/cache/conftool/dbconfig/20260302-202716-marostegui.json
* 20:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P89575 and previous config saved to /var/cache/conftool/dbconfig/20260302-202604-marostegui.json
* 20:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P89574 and previous config saved to /var/cache/conftool/dbconfig/20260302-201209-marostegui.json
* 20:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P89573 and previous config saved to /var/cache/conftool/dbconfig/20260302-201057-marostegui.json
* 20:01 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 20:00 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 19:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P89572 and previous config saved to /var/cache/conftool/dbconfig/20260302-195702-marostegui.json
* 19:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89571 and previous config saved to /var/cache/conftool/dbconfig/20260302-195549-marostegui.json
* 19:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2173 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89570 and previous config saved to /var/cache/conftool/dbconfig/20260302-194435-marostegui.json
* 19:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 19:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89569 and previous config saved to /var/cache/conftool/dbconfig/20260302-194411-marostegui.json
* 19:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89568 and previous config saved to /var/cache/conftool/dbconfig/20260302-194155-marostegui.json
* 19:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1196 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89566 and previous config saved to /var/cache/conftool/dbconfig/20260302-193119-marostegui.json
* 19:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 19:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 19:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89565 and previous config saved to /var/cache/conftool/dbconfig/20260302-193046-marostegui.json
* 19:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P89564 and previous config saved to /var/cache/conftool/dbconfig/20260302-192903-marostegui.json
* 19:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P89563 and previous config saved to /var/cache/conftool/dbconfig/20260302-191539-marostegui.json
* 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P89562 and previous config saved to /var/cache/conftool/dbconfig/20260302-191355-marostegui.json
* 19:12 dzahn@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:12 dzahn@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:04 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2095.codfw.wmnet with OS bullseye
* 19:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P89561 and previous config saved to /var/cache/conftool/dbconfig/20260302-190032-marostegui.json
* 18:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89560 and previous config saved to /var/cache/conftool/dbconfig/20260302-185848-marostegui.json
* 18:54 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 18:53 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 18:48 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89559 and previous config saved to /var/cache/conftool/dbconfig/20260302-184832-marostegui.json
* 18:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 18:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89558 and previous config saved to /var/cache/conftool/dbconfig/20260302-184808-marostegui.json
* 18:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89557 and previous config saved to /var/cache/conftool/dbconfig/20260302-184524-marostegui.json
* 18:34 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89556 and previous config saved to /var/cache/conftool/dbconfig/20260302-183449-marostegui.json
* 18:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1195.eqiad.wmnet with reason: Maintenance
* 18:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89555 and previous config saved to /var/cache/conftool/dbconfig/20260302-183425-marostegui.json
* 18:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P89554 and previous config saved to /var/cache/conftool/dbconfig/20260302-183300-marostegui.json
* 18:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P89553 and previous config saved to /var/cache/conftool/dbconfig/20260302-181918-marostegui.json
* 18:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P89552 and previous config saved to /var/cache/conftool/dbconfig/20260302-181753-marostegui.json
* 18:16 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 18:04 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P89551 and previous config saved to /var/cache/conftool/dbconfig/20260302-180411-marostegui.json
* 18:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89550 and previous config saved to /var/cache/conftool/dbconfig/20260302-180245-marostegui.json
* 18:01 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:54 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:53 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host contint2003
* 17:53 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host contint2003
* 17:52 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:52 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
* 17:52 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
* 17:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89549 and previous config saved to /var/cache/conftool/dbconfig/20260302-174917-marostegui.json
* 17:49 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 17:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 17:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89548 and previous config saved to /var/cache/conftool/dbconfig/20260302-174903-marostegui.json
* 17:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89547 and previous config saved to /var/cache/conftool/dbconfig/20260302-174854-marostegui.json
* 17:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:44 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2095.codfw.wmnet with OS bullseye
* 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host contint2003
* 17:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host contint2003
* 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
* 17:42 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
* 17:39 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 17:38 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1186 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89546 and previous config saved to /var/cache/conftool/dbconfig/20260302-173827-marostegui.json
* 17:38 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 17:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89545 and previous config saved to /var/cache/conftool/dbconfig/20260302-173803-marostegui.json
* 17:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 17:36 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 17:34 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 17:33 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 17:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P89544 and previous config saved to /var/cache/conftool/dbconfig/20260302-173347-marostegui.json
* 17:32 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.update-replication (exit_code=99)
* 17:32 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 17:24 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 17:23 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 17:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P89543 and previous config saved to /var/cache/conftool/dbconfig/20260302-172256-marostegui.json
* 17:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P89542 and previous config saved to /var/cache/conftool/dbconfig/20260302-171839-marostegui.json
* 17:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P89541 and previous config saved to /var/cache/conftool/dbconfig/20260302-170748-marostegui.json
* 17:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89540 and previous config saved to /var/cache/conftool/dbconfig/20260302-170331-marostegui.json
* 16:52 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2230.codfw.wmnet with OS trixie
* 16:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89539 and previous config saved to /var/cache/conftool/dbconfig/20260302-165240-marostegui.json
* 16:51 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2146 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89538 and previous config saved to /var/cache/conftool/dbconfig/20260302-165153-marostegui.json
* 16:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2146.codfw.wmnet with reason: Maintenance
* 16:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89537 and previous config saved to /var/cache/conftool/dbconfig/20260302-165129-marostegui.json
* 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1184 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89536 and previous config saved to /var/cache/conftool/dbconfig/20260302-164141-marostegui.json
* 16:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89535 and previous config saved to /var/cache/conftool/dbconfig/20260302-164118-marostegui.json
* 16:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P89534 and previous config saved to /var/cache/conftool/dbconfig/20260302-163622-marostegui.json
* 16:29 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2230.codfw.wmnet with reason: host reimage
* 16:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P89533 and previous config saved to /var/cache/conftool/dbconfig/20260302-162610-marostegui.json
* 16:21 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2230.codfw.wmnet with reason: host reimage
* 16:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P89532 and previous config saved to /var/cache/conftool/dbconfig/20260302-162115-marostegui.json
* 16:19 moritzm: installing PAM security updates on Bookworm
* 16:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P89531 and previous config saved to /var/cache/conftool/dbconfig/20260302-161102-marostegui.json
* 16:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89530 and previous config saved to /var/cache/conftool/dbconfig/20260302-160607-marostegui.json
* 16:05 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db2230.codfw.wmnet with OS trixie
* 15:56 moritzm: installing glibc bugfix updates from trixie point release
* 15:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89529 and previous config saved to /var/cache/conftool/dbconfig/20260302-155555-marostegui.json
* 15:55 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2145 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89528 and previous config saved to /var/cache/conftool/dbconfig/20260302-155527-marostegui.json
* 15:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2145.codfw.wmnet with reason: Maintenance
* 15:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 15:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1169.eqiad.wmnet
* 15:45 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89527 and previous config saved to /var/cache/conftool/dbconfig/20260302-154520-marostegui.json
* 15:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 15:38 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 15:34 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1169.eqiad.wmnet
* 15:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1167.eqiad.wmnet
* 15:32 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 15:32 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 15:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1209.eqiad.wmnet with reason: Maintenance
* 15:31 marostegui@cumin1003: dbctl commit (dc=all): 'Restore db1226 full weight after schema change', diff saved to https://phabricator.wikimedia.org/P89526 and previous config saved to /var/cache/conftool/dbconfig/20260302-153100-marostegui.json
* 15:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P89525 and previous config saved to /var/cache/conftool/dbconfig/20260302-152334-marostegui.json
* 15:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1167.eqiad.wmnet
* 15:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1166.eqiad.wmnet
* 15:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2198.codfw.wmnet with reason: Maintenance
* 15:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89524 and previous config saved to /var/cache/conftool/dbconfig/20260302-151838-marostegui.json
* 15:10 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1166.eqiad.wmnet
* 15:10 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1165.eqiad.wmnet
* 15:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P89523 and previous config saved to /var/cache/conftool/dbconfig/20260302-150826-marostegui.json
* 15:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P89522 and previous config saved to /var/cache/conftool/dbconfig/20260302-150330-marostegui.json
* 15:00 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1097.eqiad.wmnet with OS bullseye
* 15:00 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:58 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1165.eqiad.wmnet
* 14:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1164.eqiad.wmnet
* 14:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89520 and previous config saved to /var/cache/conftool/dbconfig/20260302-145318-marostegui.json
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1164.eqiad.wmnet
* 14:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1163.eqiad.wmnet
* 14:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P89519 and previous config saved to /var/cache/conftool/dbconfig/20260302-144823-marostegui.json
* 14:41 Lucas_WMDE: UTC afternoon backport+config window done
* 14:40 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247057{{!}}IPInfo: Set log level to "info" (T374718)]] (duration: 08m 01s)
* 14:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1163.eqiad.wmnet
* 14:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1162.eqiad.wmnet
* 14:36 lucaswerkmeister-wmde@deploy2002: kharlan, lucaswerkmeister-wmde: Continuing with sync
* 14:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1226 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89517 and previous config saved to /var/cache/conftool/dbconfig/20260302-143608-marostegui.json
* 14:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1226.eqiad.wmnet with reason: Maintenance
* 14:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89516 and previous config saved to /var/cache/conftool/dbconfig/20260302-143544-marostegui.json
* 14:34 lucaswerkmeister-wmde@deploy2002: kharlan, lucaswerkmeister-wmde: Backport for [[gerrit:1247057{{!}}IPInfo: Set log level to "info" (T374718)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89515 and previous config saved to /var/cache/conftool/dbconfig/20260302-143315-marostegui.json
* 14:32 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for [[gerrit:1247057{{!}}IPInfo: Set log level to "info" (T374718)]]
* 14:31 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
* 14:30 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for [[gerrit:1245364{{!}}Add configurations for graphql usage survey and its pipeline tests (T414476)]] (duration: 09m 44s)
* 14:27 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:26 lucaswerkmeister-wmde@deploy2002: itamar, lucaswerkmeister-wmde: Continuing with sync
* 14:26 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 14:25 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1162.eqiad.wmnet
* 14:25 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1161.eqiad.wmnet
* 14:23 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
* 14:22 lucaswerkmeister-wmde@deploy2002: itamar, lucaswerkmeister-wmde: Backport for [[gerrit:1245364{{!}}Add configurations for graphql usage survey and its pipeline tests (T414476)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:20 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for [[gerrit:1245364{{!}}Add configurations for graphql usage survey and its pipeline tests (T414476)]]
* 14:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P89514 and previous config saved to /var/cache/conftool/dbconfig/20260302-142037-marostegui.json
* 14:19 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:18 lucaswerkmeister-wmde@deploy2002: mwscript-k8s job started: namespaceDupes lawiki --fix # [[phab:T418706|T418706]]
* 14:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89513 and previous config saved to /var/cache/conftool/dbconfig/20260302-141834-marostegui.json
* 14:18 elukey@puppetserver1001: conftool action : set/pooled=no; selector: name=ms-fe1013.eqiad.wmnet
* 14:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2195.codfw.wmnet with reason: Maintenance
* 14:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2004.codfw.wmnet
* 14:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89512 and previous config saved to /var/cache/conftool/dbconfig/20260302-141810-marostegui.json
* 14:17 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247063{{!}}lawiki: add Adumbratio (draft) namespace (T418706)]] (duration: 07m 27s)
* 14:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people2004.codfw.wmnet
* 14:13 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, anzx: Continuing with sync
* 14:13 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 14:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1161.eqiad.wmnet
* 14:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1160.eqiad.wmnet
* 14:13 moritzm: installing libcap2 updates from Trixie point release
* 14:12 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, anzx: Backport for [[gerrit:1247063{{!}}lawiki: add Adumbratio (draft) namespace (T418706)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:11 elukey@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:11 elukey@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 14:10 elukey@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:10 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for [[gerrit:1247063{{!}}lawiki: add Adumbratio (draft) namespace (T418706)]]
* 14:10 elukey@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 14:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1028.eqiad.wmnet
* 14:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 14:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P89511 and previous config saved to /var/cache/conftool/dbconfig/20260302-140529-marostegui.json
* 14:04 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1097.eqiad.wmnet with reason: host reimage
* 14:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P89510 and previous config saved to /var/cache/conftool/dbconfig/20260302-140302-marostegui.json
* 14:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1028.eqiad.wmnet
* 14:01 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1160.eqiad.wmnet
* 14:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 14:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1159.eqiad.wmnet
* 14:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1025.eqiad.wmnet
* 13:57 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1097.eqiad.wmnet with reason: host reimage
* 13:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1025.eqiad.wmnet
* 13:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89509 and previous config saved to /var/cache/conftool/dbconfig/20260302-135021-marostegui.json
* 13:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P89508 and previous config saved to /var/cache/conftool/dbconfig/20260302-134754-marostegui.json
* 13:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1159.eqiad.wmnet
* 13:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1158.eqiad.wmnet
* 13:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1097.eqiad.wmnet with OS bullseye
* 13:38 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1097
* 13:38 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1097
* 13:38 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:38 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt ms-be1097 - jclark@cumin1003"
* 13:37 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt ms-be1097 - jclark@cumin1003"
* 13:35 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1158.eqiad.wmnet
* 13:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1157.eqiad.wmnet
* 13:35 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
* 13:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1214 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89507 and previous config saved to /var/cache/conftool/dbconfig/20260302-133503-marostegui.json
* 13:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1214.eqiad.wmnet with reason: Maintenance
* 13:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89506 and previous config saved to /var/cache/conftool/dbconfig/20260302-133440-marostegui.json
* 13:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89505 and previous config saved to /var/cache/conftool/dbconfig/20260302-133247-marostegui.json
* 13:28 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 13:27 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 13:27 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1097
* 13:26 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1097
* 13:24 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1157.eqiad.wmnet
* 13:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1156.eqiad.wmnet
* 13:22 brouberol: Running `echo 'https://turnilo-next.wikimedia.org' {{!}} mwscript-k8s --attach -- purgeList.php`
* 13:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P89504 and previous config saved to /var/cache/conftool/dbconfig/20260302-131932-marostegui.json
* 13:16 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2181 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89503 and previous config saved to /var/cache/conftool/dbconfig/20260302-131653-marostegui.json
* 13:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2181.codfw.wmnet with reason: Maintenance
* 13:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89502 and previous config saved to /var/cache/conftool/dbconfig/20260302-131630-marostegui.json
* 13:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1024.eqiad.wmnet
* 13:14 moritzm: installing libcap2 updates from Bookworm point release
* 13:12 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1156.eqiad.wmnet
* 13:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1155.eqiad.wmnet
* 13:08 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1024.eqiad.wmnet
* 13:07 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
* 13:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P89500 and previous config saved to /var/cache/conftool/dbconfig/20260302-130424-marostegui.json
* 13:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P89499 and previous config saved to /var/cache/conftool/dbconfig/20260302-130122-marostegui.json
* 13:00 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 12:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2356.codfw.wmnet
* 12:58 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2356.codfw.wmnet
* 12:58 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1155.eqiad.wmnet
* 12:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1154.eqiad.wmnet
* 12:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89498 and previous config saved to /var/cache/conftool/dbconfig/20260302-124917-marostegui.json
* 12:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1154.eqiad.wmnet
* 12:46 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1153.eqiad.wmnet
* 12:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P89497 and previous config saved to /var/cache/conftool/dbconfig/20260302-124615-marostegui.json
* 12:35 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1153.eqiad.wmnet
* 12:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1152.eqiad.wmnet
* 12:32 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89494 and previous config saved to /var/cache/conftool/dbconfig/20260302-123253-marostegui.json
* 12:32 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1203.eqiad.wmnet with reason: Maintenance
* 12:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89493 and previous config saved to /var/cache/conftool/dbconfig/20260302-123229-marostegui.json
* 12:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89492 and previous config saved to /var/cache/conftool/dbconfig/20260302-123108-marostegui.json
* 12:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 12:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 12:24 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1152.eqiad.wmnet
* 12:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1151.eqiad.wmnet
* 12:23 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 12:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 12:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P89491 and previous config saved to /var/cache/conftool/dbconfig/20260302-121722-marostegui.json
* 12:15 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89490 and previous config saved to /var/cache/conftool/dbconfig/20260302-121525-marostegui.json
* 12:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2167.codfw.wmnet with reason: Maintenance
* 12:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89489 and previous config saved to /var/cache/conftool/dbconfig/20260302-121501-marostegui.json
* 12:12 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1151.eqiad.wmnet
* 12:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1150.eqiad.wmnet
* 12:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P89488 and previous config saved to /var/cache/conftool/dbconfig/20260302-120214-marostegui.json
* 12:00 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1150.eqiad.wmnet
* 11:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P89487 and previous config saved to /var/cache/conftool/dbconfig/20260302-115953-marostegui.json
* 11:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89486 and previous config saved to /var/cache/conftool/dbconfig/20260302-114706-marostegui.json
* 11:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P89485 and previous config saved to /var/cache/conftool/dbconfig/20260302-114446-marostegui.json
* 11:30 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1193 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89484 and previous config saved to /var/cache/conftool/dbconfig/20260302-113034-marostegui.json
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo1001.eqiad.wmnet
* 11:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1193.eqiad.wmnet with reason: Maintenance
* 11:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89483 and previous config saved to /var/cache/conftool/dbconfig/20260302-113010-marostegui.json
* 11:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89482 and previous config saved to /var/cache/conftool/dbconfig/20260302-112937-marostegui.json
* 11:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo1001.eqiad.wmnet
* 11:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P89481 and previous config saved to /var/cache/conftool/dbconfig/20260302-111502-marostegui.json
* 11:13 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2166 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89480 and previous config saved to /var/cache/conftool/dbconfig/20260302-111351-marostegui.json
* 11:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2166.codfw.wmnet with reason: Maintenance
* 11:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89479 and previous config saved to /var/cache/conftool/dbconfig/20260302-111327-marostegui.json
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet
* 10:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P89478 and previous config saved to /var/cache/conftool/dbconfig/20260302-105955-marostegui.json
* 10:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P89477 and previous config saved to /var/cache/conftool/dbconfig/20260302-105818-marostegui.json
* 10:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet
* 10:55 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:54 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:52 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 10:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 10:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 10:49 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 10:46 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru and A:cp - 3.0 upgrade ()
* 10:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89476 and previous config saved to /var/cache/conftool/dbconfig/20260302-104446-marostegui.json
* 10:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P89475 and previous config saved to /var/cache/conftool/dbconfig/20260302-104310-marostegui.json
* 10:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89474 and previous config saved to /var/cache/conftool/dbconfig/20260302-102825-marostegui.json
* 10:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1192.eqiad.wmnet with reason: Maintenance
* 10:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89473 and previous config saved to /var/cache/conftool/dbconfig/20260302-102800-marostegui.json
* 10:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P89472 and previous config saved to /var/cache/conftool/dbconfig/20260302-101252-marostegui.json
* 10:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2164 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89471 and previous config saved to /var/cache/conftool/dbconfig/20260302-101200-marostegui.json
* 10:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2164.codfw.wmnet with reason: Maintenance
* 10:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89470 and previous config saved to /var/cache/conftool/dbconfig/20260302-101135-marostegui.json
* 10:08 moritzm: installing intel-microcode bugfix updates on Bookworm hosts
* 09:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P89469 and previous config saved to /var/cache/conftool/dbconfig/20260302-095744-marostegui.json
* 09:57 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru and A:cp - 3.0 upgrade ()
* 09:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P89468 and previous config saved to /var/cache/conftool/dbconfig/20260302-095627-marostegui.json
* 09:55 fabfur: start upgrading haproxy to 3.0 on A:cp-text_magru ([[phab:T417253|T417253]])
* 09:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89467 and previous config saved to /var/cache/conftool/dbconfig/20260302-094236-marostegui.json
* 09:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P89466 and previous config saved to /var/cache/conftool/dbconfig/20260302-094118-marostegui.json
* 09:35 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:35 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:34 moritzm: installing GnuTLS security updates
* 09:34 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:33 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89465 and previous config saved to /var/cache/conftool/dbconfig/20260302-092610-marostegui.json
* 09:26 mlitn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1245265{{!}}Limit additional whitespace to sticky header version only (T416598)]] (duration: 11m 02s)
* 09:26 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89464 and previous config saved to /var/cache/conftool/dbconfig/20260302-092600-marostegui.json
* 09:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1178.eqiad.wmnet with reason: Maintenance
* 09:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89463 and previous config saved to /var/cache/conftool/dbconfig/20260302-092535-marostegui.json
* 09:21 mlitn@deploy2002: mlitn: Continuing with sync
* 09:16 mlitn@deploy2002: mlitn: Backport for [[gerrit:1245265{{!}}Limit additional whitespace to sticky header version only (T416598)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:15 mlitn@deploy2002: Started scap sync-world: Backport for [[gerrit:1245265{{!}}Limit additional whitespace to sticky header version only (T416598)]]
* 09:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P89462 and previous config saved to /var/cache/conftool/dbconfig/20260302-091027-marostegui.json
* 09:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2163 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89461 and previous config saved to /var/cache/conftool/dbconfig/20260302-091003-marostegui.json
* 09:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2163.codfw.wmnet with reason: Maintenance
* 09:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89460 and previous config saved to /var/cache/conftool/dbconfig/20260302-090938-marostegui.json
* 09:08 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1246904{{!}}HCaptchaEnterpriseHealthChecker: Add configurable retry count and delay (T418477)]] (duration: 16m 09s)
* 09:02 kharlan@deploy2002: kharlan: Continuing with sync
* 08:57 kharlan@deploy2002: kharlan: Backport for [[gerrit:1246904{{!}}HCaptchaEnterpriseHealthChecker: Add configurable retry count and delay (T418477)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P89459 and previous config saved to /var/cache/conftool/dbconfig/20260302-085519-marostegui.json
* 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P89458 and previous config saved to /var/cache/conftool/dbconfig/20260302-085430-marostegui.json
* 08:51 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1246904{{!}}HCaptchaEnterpriseHealthChecker: Add configurable retry count and delay (T418477)]]
* 08:48 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru and A:cp - 3.0 upgrade ()
* 08:47 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbproxy1029.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 08:45 moritzm: installing libxml2 security updates
* 08:44 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1240672{{!}}Enable revert risk filters for first batch of wikis: < 1000 monthly edits (T411485)]] (duration: 37m 12s)
* 08:42 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dbproxy1029.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 08:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbproxy1028.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 08:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89457 and previous config saved to /var/cache/conftool/dbconfig/20260302-084010-marostegui.json
* 08:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P89456 and previous config saved to /var/cache/conftool/dbconfig/20260302-083922-marostegui.json
* 08:31 kgraessle@deploy2002: kgraessle: Continuing with sync
* 08:30 kgraessle@deploy2002: kgraessle: Backport for [[gerrit:1240672{{!}}Enable revert risk filters for first batch of wikis: < 1000 monthly edits (T411485)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dbproxy1028.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 08:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89455 and previous config saved to /var/cache/conftool/dbconfig/20260302-082414-marostegui.json
* 08:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1177 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89454 and previous config saved to /var/cache/conftool/dbconfig/20260302-082333-marostegui.json
* 08:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1177.eqiad.wmnet with reason: Maintenance
* 08:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89453 and previous config saved to /var/cache/conftool/dbconfig/20260302-082309-marostegui.json
* 08:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbproxy1028.eqiad.wmnet with reason: Maintenance
* 08:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbproxy1029.eqiad.wmnet with reason: Maintenance
* 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89452 and previous config saved to /var/cache/conftool/dbconfig/20260302-080813-marostegui.json
* 08:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2161.codfw.wmnet with reason: Maintenance
* 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P89451 and previous config saved to /var/cache/conftool/dbconfig/20260302-080800-marostegui.json
* 08:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89450 and previous config saved to /var/cache/conftool/dbconfig/20260302-080748-marostegui.json
* 08:07 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1240672{{!}}Enable revert risk filters for first batch of wikis: < 1000 monthly edits (T411485)]]
* 08:05 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru and A:cp - 3.0 upgrade ()
* 08:05 fabfur: start upgrading haproxy to 3.0 on A:cp-upload_magru ([[phab:T417253|T417253]])
* 07:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P89449 and previous config saved to /var/cache/conftool/dbconfig/20260302-075252-marostegui.json
* 07:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P89448 and previous config saved to /var/cache/conftool/dbconfig/20260302-075241-marostegui.json
* 07:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89447 and previous config saved to /var/cache/conftool/dbconfig/20260302-073745-marostegui.json
* 07:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P89446 and previous config saved to /var/cache/conftool/dbconfig/20260302-073732-marostegui.json
* 07:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89445 and previous config saved to /var/cache/conftool/dbconfig/20260302-072224-marostegui.json
* 07:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89444 and previous config saved to /var/cache/conftool/dbconfig/20260302-072058-marostegui.json
* 07:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1172.eqiad.wmnet with reason: Maintenance
* 07:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 07:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89443 and previous config saved to /var/cache/conftool/dbconfig/20260302-070523-marostegui.json
* 07:05 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2154 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89442 and previous config saved to /var/cache/conftool/dbconfig/20260302-070512-marostegui.json
* 07:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2154.codfw.wmnet with reason: Maintenance
* 07:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89441 and previous config saved to /var/cache/conftool/dbconfig/20260302-070447-marostegui.json
* 07:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1244: After schema change
* 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P89439 and previous config saved to /var/cache/conftool/dbconfig/20260302-065014-marostegui.json
* 06:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P89438 and previous config saved to /var/cache/conftool/dbconfig/20260302-064938-marostegui.json
* 06:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P89436 and previous config saved to /var/cache/conftool/dbconfig/20260302-063506-marostegui.json
* 06:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P89435 and previous config saved to /var/cache/conftool/dbconfig/20260302-063430-marostegui.json
* 06:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89433 and previous config saved to /var/cache/conftool/dbconfig/20260302-061957-marostegui.json
* 06:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89432 and previous config saved to /var/cache/conftool/dbconfig/20260302-061922-marostegui.json
* 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2240.codfw.wmnet with reason: Maintenance
* 06:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1244: After schema change
* 06:15 marostegui@dns1004: END - running authdns-update
* 06:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2240 [[phab:T418080|T418080]]', diff saved to https://phabricator.wikimedia.org/P89430 and previous config saved to /var/cache/conftool/dbconfig/20260302-061428-marostegui.json
* 06:13 marostegui@dns1004: START - running authdns-update
* 06:13 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2179 to s4 primary and set section read-write [[phab:T418080|T418080]]', diff saved to https://phabricator.wikimedia.org/P89429 and previous config saved to /var/cache/conftool/dbconfig/20260302-061316-marostegui.json
* 06:12 marostegui@cumin1003: dbctl commit (dc=all): 'Set s4 codfw as read-only for maintenance - [[phab:T418080|T418080]]', diff saved to https://phabricator.wikimedia.org/P89428 and previous config saved to /var/cache/conftool/dbconfig/20260302-061252-marostegui.json
* 06:06 marostegui: Starting s4 codfw failover from db2240 to db2179 - [[phab:T418080|T418080]]
* 06:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 42 hosts with reason: Primary switchover s4 [[phab:T418080|T418080]]
* 06:03 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2179 with weight 0 [[phab:T418080|T418080]]', diff saved to https://phabricator.wikimedia.org/P89427 and previous config saved to /var/cache/conftool/dbconfig/20260302-060317-marostegui.json
* 06:03 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89426 and previous config saved to /var/cache/conftool/dbconfig/20260302-060317-marostegui.json
* 06:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 06:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 06:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2152 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89425 and previous config saved to /var/cache/conftool/dbconfig/20260302-060245-marostegui.json
* 06:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2152.codfw.wmnet with reason: Maintenance
* 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2220.codfw.wmnet with reason: Maintenance
* 02:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 13s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 00:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 00:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89424 and previous config saved to /var/cache/conftool/dbconfig/20260302-004950-marostegui.json
* 00:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P89423 and previous config saved to /var/cache/conftool/dbconfig/20260302-003441-marostegui.json
* 00:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P89422 and previous config saved to /var/cache/conftool/dbconfig/20260302-001933-marostegui.json
* 00:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89421 and previous config saved to /var/cache/conftool/dbconfig/20260302-000425-marostegui.json
* 00:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1253 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89420 and previous config saved to /var/cache/conftool/dbconfig/20260302-000208-marostegui.json
* 00:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1253.eqiad.wmnet with reason: Maintenance
* 00:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89419 and previous config saved to /var/cache/conftool/dbconfig/20260302-000143-marostegui.json
== 2026-03-01 ==
* 23:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P89418 and previous config saved to /var/cache/conftool/dbconfig/20260301-234635-marostegui.json
* 23:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89417 and previous config saved to /var/cache/conftool/dbconfig/20260301-233524-marostegui.json
* 23:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P89416 and previous config saved to /var/cache/conftool/dbconfig/20260301-233127-marostegui.json
* 23:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P89415 and previous config saved to /var/cache/conftool/dbconfig/20260301-232016-marostegui.json
* 23:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89414 and previous config saved to /var/cache/conftool/dbconfig/20260301-231619-marostegui.json
* 23:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89413 and previous config saved to /var/cache/conftool/dbconfig/20260301-231404-marostegui.json
* 23:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1236.eqiad.wmnet with reason: Maintenance
* 23:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89412 and previous config saved to /var/cache/conftool/dbconfig/20260301-231339-marostegui.json
* 23:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P89411 and previous config saved to /var/cache/conftool/dbconfig/20260301-230508-marostegui.json
* 22:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P89410 and previous config saved to /var/cache/conftool/dbconfig/20260301-225832-marostegui.json
* 22:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89409 and previous config saved to /var/cache/conftool/dbconfig/20260301-224959-marostegui.json
* 22:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2222 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89408 and previous config saved to /var/cache/conftool/dbconfig/20260301-224451-marostegui.json
* 22:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2222.codfw.wmnet with reason: Maintenance
* 22:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89407 and previous config saved to /var/cache/conftool/dbconfig/20260301-224426-marostegui.json
* 22:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P89406 and previous config saved to /var/cache/conftool/dbconfig/20260301-224324-marostegui.json
* 22:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P89405 and previous config saved to /var/cache/conftool/dbconfig/20260301-222919-marostegui.json
* 22:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89404 and previous config saved to /var/cache/conftool/dbconfig/20260301-222815-marostegui.json
* 22:26 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1231 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89403 and previous config saved to /var/cache/conftool/dbconfig/20260301-222600-marostegui.json
* 22:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1231.eqiad.wmnet with reason: Maintenance
* 22:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89402 and previous config saved to /var/cache/conftool/dbconfig/20260301-222536-marostegui.json
* 22:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P89401 and previous config saved to /var/cache/conftool/dbconfig/20260301-221410-marostegui.json
* 22:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P89400 and previous config saved to /var/cache/conftool/dbconfig/20260301-221027-marostegui.json
* 21:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89399 and previous config saved to /var/cache/conftool/dbconfig/20260301-215902-marostegui.json
* 21:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P89398 and previous config saved to /var/cache/conftool/dbconfig/20260301-215519-marostegui.json
* 21:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89397 and previous config saved to /var/cache/conftool/dbconfig/20260301-215404-marostegui.json
* 21:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2221.codfw.wmnet with reason: Maintenance
* 21:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89396 and previous config saved to /var/cache/conftool/dbconfig/20260301-215339-marostegui.json
* 21:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89395 and previous config saved to /var/cache/conftool/dbconfig/20260301-214011-marostegui.json
* 21:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P89394 and previous config saved to /var/cache/conftool/dbconfig/20260301-213831-marostegui.json
* 21:34 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89393 and previous config saved to /var/cache/conftool/dbconfig/20260301-213410-marostegui.json
* 21:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 21:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89392 and previous config saved to /var/cache/conftool/dbconfig/20260301-213346-marostegui.json
* 21:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P89391 and previous config saved to /var/cache/conftool/dbconfig/20260301-212323-marostegui.json
* 21:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P89390 and previous config saved to /var/cache/conftool/dbconfig/20260301-211837-marostegui.json
* 21:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89389 and previous config saved to /var/cache/conftool/dbconfig/20260301-210815-marostegui.json
* 21:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P89388 and previous config saved to /var/cache/conftool/dbconfig/20260301-210329-marostegui.json
* 21:03 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89387 and previous config saved to /var/cache/conftool/dbconfig/20260301-210309-marostegui.json
* 21:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2218.codfw.wmnet with reason: Maintenance
* 21:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89386 and previous config saved to /var/cache/conftool/dbconfig/20260301-210244-marostegui.json
* 20:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89385 and previous config saved to /var/cache/conftool/dbconfig/20260301-204820-marostegui.json
* 20:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P89384 and previous config saved to /var/cache/conftool/dbconfig/20260301-204736-marostegui.json
* 20:46 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1202 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89383 and previous config saved to /var/cache/conftool/dbconfig/20260301-204606-marostegui.json
* 20:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1202.eqiad.wmnet with reason: Maintenance
* 20:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89382 and previous config saved to /var/cache/conftool/dbconfig/20260301-204541-marostegui.json
* 20:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P89381 and previous config saved to /var/cache/conftool/dbconfig/20260301-203227-marostegui.json
* 20:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P89380 and previous config saved to /var/cache/conftool/dbconfig/20260301-203033-marostegui.json
* 20:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89379 and previous config saved to /var/cache/conftool/dbconfig/20260301-201720-marostegui.json
* 20:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P89378 and previous config saved to /var/cache/conftool/dbconfig/20260301-201525-marostegui.json
* 20:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2208 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89377 and previous config saved to /var/cache/conftool/dbconfig/20260301-201212-marostegui.json
* 20:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2208.codfw.wmnet with reason: Maintenance
* 20:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2200.codfw.wmnet with reason: Maintenance
* 20:04 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2198.codfw.wmnet with reason: Maintenance
* 20:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89376 and previous config saved to /var/cache/conftool/dbconfig/20260301-200422-marostegui.json
* 20:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89375 and previous config saved to /var/cache/conftool/dbconfig/20260301-200016-marostegui.json
* 19:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1194 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89374 and previous config saved to /var/cache/conftool/dbconfig/20260301-195803-marostegui.json
* 19:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1194.eqiad.wmnet with reason: Maintenance
* 19:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89373 and previous config saved to /var/cache/conftool/dbconfig/20260301-195738-marostegui.json
* 19:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P89372 and previous config saved to /var/cache/conftool/dbconfig/20260301-194914-marostegui.json
* 19:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P89371 and previous config saved to /var/cache/conftool/dbconfig/20260301-194230-marostegui.json
* 19:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P89370 and previous config saved to /var/cache/conftool/dbconfig/20260301-193406-marostegui.json
* 19:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P89369 and previous config saved to /var/cache/conftool/dbconfig/20260301-192721-marostegui.json
* 19:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89368 and previous config saved to /var/cache/conftool/dbconfig/20260301-191858-marostegui.json
* 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2182 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89367 and previous config saved to /var/cache/conftool/dbconfig/20260301-191340-marostegui.json
* 19:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89366 and previous config saved to /var/cache/conftool/dbconfig/20260301-191315-marostegui.json
* 19:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89365 and previous config saved to /var/cache/conftool/dbconfig/20260301-191213-marostegui.json
* 19:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1191 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89364 and previous config saved to /var/cache/conftool/dbconfig/20260301-190958-marostegui.json
* 19:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1191.eqiad.wmnet with reason: Maintenance
* 19:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89363 and previous config saved to /var/cache/conftool/dbconfig/20260301-190934-marostegui.json
* 18:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P89362 and previous config saved to /var/cache/conftool/dbconfig/20260301-185807-marostegui.json
* 18:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P89361 and previous config saved to /var/cache/conftool/dbconfig/20260301-185425-marostegui.json
* 18:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P89360 and previous config saved to /var/cache/conftool/dbconfig/20260301-184259-marostegui.json
* 18:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P89359 and previous config saved to /var/cache/conftool/dbconfig/20260301-183917-marostegui.json
* 18:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89358 and previous config saved to /var/cache/conftool/dbconfig/20260301-182750-marostegui.json
* 18:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89357 and previous config saved to /var/cache/conftool/dbconfig/20260301-182409-marostegui.json
* 18:22 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2168 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89356 and previous config saved to /var/cache/conftool/dbconfig/20260301-182238-marostegui.json
* 18:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 18:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89355 and previous config saved to /var/cache/conftool/dbconfig/20260301-182213-marostegui.json
* 18:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89354 and previous config saved to /var/cache/conftool/dbconfig/20260301-182153-marostegui.json
* 18:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 18:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 18:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89353 and previous config saved to /var/cache/conftool/dbconfig/20260301-181818-marostegui.json
* 18:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P89352 and previous config saved to /var/cache/conftool/dbconfig/20260301-180705-marostegui.json
* 18:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P89351 and previous config saved to /var/cache/conftool/dbconfig/20260301-180310-marostegui.json
* 17:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P89350 and previous config saved to /var/cache/conftool/dbconfig/20260301-175157-marostegui.json
* 17:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P89349 and previous config saved to /var/cache/conftool/dbconfig/20260301-174802-marostegui.json
* 17:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89348 and previous config saved to /var/cache/conftool/dbconfig/20260301-173649-marostegui.json
* 17:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89347 and previous config saved to /var/cache/conftool/dbconfig/20260301-173253-marostegui.json
* 17:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89346 and previous config saved to /var/cache/conftool/dbconfig/20260301-173134-marostegui.json
* 17:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 17:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89345 and previous config saved to /var/cache/conftool/dbconfig/20260301-173110-marostegui.json
* 17:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89344 and previous config saved to /var/cache/conftool/dbconfig/20260301-172742-marostegui.json
* 17:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 17:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89343 and previous config saved to /var/cache/conftool/dbconfig/20260301-172717-marostegui.json
* 17:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P89342 and previous config saved to /var/cache/conftool/dbconfig/20260301-171602-marostegui.json
* 17:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P89341 and previous config saved to /var/cache/conftool/dbconfig/20260301-171210-marostegui.json
* 17:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P89340 and previous config saved to /var/cache/conftool/dbconfig/20260301-170053-marostegui.json
* 16:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P89339 and previous config saved to /var/cache/conftool/dbconfig/20260301-165701-marostegui.json
* 16:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89338 and previous config saved to /var/cache/conftool/dbconfig/20260301-164545-marostegui.json
* 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89337 and previous config saved to /var/cache/conftool/dbconfig/20260301-164153-marostegui.json
* 16:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2150 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89336 and previous config saved to /var/cache/conftool/dbconfig/20260301-164022-marostegui.json
* 16:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2150.codfw.wmnet with reason: Maintenance
* 16:39 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89335 and previous config saved to /var/cache/conftool/dbconfig/20260301-163938-marostegui.json
* 16:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 16:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 16:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1230.eqiad.wmnet with reason: Maintenance
* 16:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2213.codfw.wmnet with reason: Maintenance
* 12:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89334 and previous config saved to /var/cache/conftool/dbconfig/20260301-122201-marostegui.json
* 12:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P89333 and previous config saved to /var/cache/conftool/dbconfig/20260301-120652-marostegui.json
* 11:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P89332 and previous config saved to /var/cache/conftool/dbconfig/20260301-115144-marostegui.json
* 11:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89331 and previous config saved to /var/cache/conftool/dbconfig/20260301-113636-marostegui.json
* 11:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2228 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89330 and previous config saved to /var/cache/conftool/dbconfig/20260301-113156-marostegui.json
* 11:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2228.codfw.wmnet with reason: Maintenance
* 11:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89329 and previous config saved to /var/cache/conftool/dbconfig/20260301-113131-marostegui.json
* 11:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 11:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 11:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1216.eqiad.wmnet with reason: Maintenance
* 11:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89328 and previous config saved to /var/cache/conftool/dbconfig/20260301-111658-marostegui.json
* 11:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P89327 and previous config saved to /var/cache/conftool/dbconfig/20260301-111622-marostegui.json
* 11:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P89326 and previous config saved to /var/cache/conftool/dbconfig/20260301-110151-marostegui.json
* 11:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P89325 and previous config saved to /var/cache/conftool/dbconfig/20260301-110114-marostegui.json
* 10:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P89324 and previous config saved to /var/cache/conftool/dbconfig/20260301-104642-marostegui.json
* 10:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89323 and previous config saved to /var/cache/conftool/dbconfig/20260301-104606-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2223 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89322 and previous config saved to /var/cache/conftool/dbconfig/20260301-104024-marostegui.json
* 10:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2223.codfw.wmnet with reason: Maintenance
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89321 and previous config saved to /var/cache/conftool/dbconfig/20260301-103958-marostegui.json
* 10:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89320 and previous config saved to /var/cache/conftool/dbconfig/20260301-103134-marostegui.json
* 10:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89319 and previous config saved to /var/cache/conftool/dbconfig/20260301-102727-marostegui.json
* 10:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1210.eqiad.wmnet with reason: Maintenance
* 10:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89318 and previous config saved to /var/cache/conftool/dbconfig/20260301-102702-marostegui.json
* 10:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P89317 and previous config saved to /var/cache/conftool/dbconfig/20260301-102450-marostegui.json
* 10:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P89316 and previous config saved to /var/cache/conftool/dbconfig/20260301-101154-marostegui.json
* 10:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P89315 and previous config saved to /var/cache/conftool/dbconfig/20260301-100942-marostegui.json
* 09:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P89314 and previous config saved to /var/cache/conftool/dbconfig/20260301-095645-marostegui.json
* 09:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89313 and previous config saved to /var/cache/conftool/dbconfig/20260301-095434-marostegui.json
* 09:48 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2211 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89312 and previous config saved to /var/cache/conftool/dbconfig/20260301-094847-marostegui.json
* 09:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2211.codfw.wmnet with reason: Maintenance
* 09:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2201.codfw.wmnet with reason: Maintenance
* 09:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89311 and previous config saved to /var/cache/conftool/dbconfig/20260301-094432-marostegui.json
* 09:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89310 and previous config saved to /var/cache/conftool/dbconfig/20260301-094137-marostegui.json
* 09:38 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1207 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89309 and previous config saved to /var/cache/conftool/dbconfig/20260301-093835-marostegui.json
* 09:38 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1207.eqiad.wmnet with reason: Maintenance
* 09:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89308 and previous config saved to /var/cache/conftool/dbconfig/20260301-093810-marostegui.json
* 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P89307 and previous config saved to /var/cache/conftool/dbconfig/20260301-092923-marostegui.json
* 09:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P89306 and previous config saved to /var/cache/conftool/dbconfig/20260301-092302-marostegui.json
* 09:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P89305 and previous config saved to /var/cache/conftool/dbconfig/20260301-091415-marostegui.json
* 09:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P89304 and previous config saved to /var/cache/conftool/dbconfig/20260301-090754-marostegui.json
* 08:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89303 and previous config saved to /var/cache/conftool/dbconfig/20260301-085907-marostegui.json
* 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89302 and previous config saved to /var/cache/conftool/dbconfig/20260301-085427-marostegui.json
* 08:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2192.codfw.wmnet with reason: Maintenance
* 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89301 and previous config saved to /var/cache/conftool/dbconfig/20260301-085403-marostegui.json
* 08:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89300 and previous config saved to /var/cache/conftool/dbconfig/20260301-085246-marostegui.json
* 08:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1200 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89299 and previous config saved to /var/cache/conftool/dbconfig/20260301-084952-marostegui.json
* 08:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1200.eqiad.wmnet with reason: Maintenance
* 08:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89298 and previous config saved to /var/cache/conftool/dbconfig/20260301-084928-marostegui.json
* 08:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P89297 and previous config saved to /var/cache/conftool/dbconfig/20260301-083855-marostegui.json
* 08:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P89296 and previous config saved to /var/cache/conftool/dbconfig/20260301-083420-marostegui.json
* 08:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P89295 and previous config saved to /var/cache/conftool/dbconfig/20260301-082346-marostegui.json
* 08:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P89294 and previous config saved to /var/cache/conftool/dbconfig/20260301-081912-marostegui.json
* 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89293 and previous config saved to /var/cache/conftool/dbconfig/20260301-080838-marostegui.json
* 08:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89292 and previous config saved to /var/cache/conftool/dbconfig/20260301-080404-marostegui.json
* 08:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2178.codfw.wmnet with reason: Maintenance
* 08:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89291 and previous config saved to /var/cache/conftool/dbconfig/20260301-080341-marostegui.json
* 08:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1185 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89290 and previous config saved to /var/cache/conftool/dbconfig/20260301-080110-marostegui.json
* 08:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1185.eqiad.wmnet with reason: Maintenance
* 08:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89289 and previous config saved to /var/cache/conftool/dbconfig/20260301-080044-marostegui.json
* 07:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P89288 and previous config saved to /var/cache/conftool/dbconfig/20260301-074833-marostegui.json
* 07:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P89287 and previous config saved to /var/cache/conftool/dbconfig/20260301-074536-marostegui.json
* 07:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P89286 and previous config saved to /var/cache/conftool/dbconfig/20260301-073324-marostegui.json
* 07:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P89285 and previous config saved to /var/cache/conftool/dbconfig/20260301-073028-marostegui.json
* 07:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89284 and previous config saved to /var/cache/conftool/dbconfig/20260301-071816-marostegui.json
* 07:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89283 and previous config saved to /var/cache/conftool/dbconfig/20260301-071521-marostegui.json
* 07:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2171 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89282 and previous config saved to /var/cache/conftool/dbconfig/20260301-071226-marostegui.json
* 07:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 07:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89281 and previous config saved to /var/cache/conftool/dbconfig/20260301-071201-marostegui.json
* 07:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89280 and previous config saved to /var/cache/conftool/dbconfig/20260301-071113-marostegui.json
* 07:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 07:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 07:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89279 and previous config saved to /var/cache/conftool/dbconfig/20260301-071040-marostegui.json
* 06:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P89278 and previous config saved to /var/cache/conftool/dbconfig/20260301-065653-marostegui.json
* 06:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P89277 and previous config saved to /var/cache/conftool/dbconfig/20260301-065531-marostegui.json
* 06:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P89276 and previous config saved to /var/cache/conftool/dbconfig/20260301-064145-marostegui.json
* 06:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P89275 and previous config saved to /var/cache/conftool/dbconfig/20260301-064023-marostegui.json
* 06:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89274 and previous config saved to /var/cache/conftool/dbconfig/20260301-062636-marostegui.json
* 06:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89273 and previous config saved to /var/cache/conftool/dbconfig/20260301-062515-marostegui.json
* 06:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89272 and previous config saved to /var/cache/conftool/dbconfig/20260301-062108-marostegui.json
* 06:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1159.eqiad.wmnet with reason: Maintenance
* 06:20 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2157 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89271 and previous config saved to /var/cache/conftool/dbconfig/20260301-062047-marostegui.json
* 06:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2157.codfw.wmnet with reason: Maintenance
* 06:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2207.codfw.wmnet with reason: Maintenance
* 06:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1222.eqiad.wmnet with reason: Maintenance
* 02:13 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 00s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
2396609
2396608
2026-03-29T02:07:02Z
Stashbot
7414
2396609
wikitext
text/x-wiki
== 2026-03-29 ==
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 13s)
* 02:00 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
== 2026-03-28 ==
* 14:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421398|T421398]]
* 14:48 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421398|T421398]]
* 14:16 mutante: releases1003 - re-enabled puppet which was disabled due to [[phab:T418109|T418109]] but should not have been disabled during switch of the deployment server; leading to [[phab:T421532|T421532]]
== 2026-03-27 ==
* 18:11 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 18:00 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:50 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:40 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:39 tchin@deploy1003: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 17:39 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:39 tchin@deploy1003: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 17:38 tchin@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 17:37 tchin@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 17:37 tchin@deploy1003: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 17:35 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 17:34 tchin@deploy1003: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 17:34 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:30 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:30 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:24 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:19 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:15 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 17:04 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:55 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:50 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:47 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:42 dancy@deploy1003: Finished deploy [releng/jenkins-deploy@31ace7e] (releasing): (no justification provided) (duration: 01m 18s)
* 16:41 dancy@deploy1003: Started deploy [releng/jenkins-deploy@31ace7e] (releasing): (no justification provided)
* 16:37 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:36 jhathaway@cumin2002: END (ERROR) - Cookbook sre.hosts.provision (exit_code=97) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:27 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:22 jhathaway@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:13 brouberol@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 16:12 brouberol@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 16:12 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:11 jhathaway@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 16:10 jhathaway@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 15:07 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:00 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:09 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:09 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change ips for frack servers - cmooney@cumin1003"
* 14:08 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: change ips for frack servers - cmooney@cumin1003"
* 14:02 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 13:52 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:51 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:49 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:49 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:48 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:47 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:47 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:45 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:41 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:40 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:14 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:11 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:10 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:08 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:06 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 11:53 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 11:51 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 11:50 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 11:48 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 11:30 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 11:27 fceratto@deploy1003: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 11:15 elukey@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host kafka-test1006.eqiad.wmnet with OS trixie
* 11:15 taavi@cumin1003: END (PASS) - Cookbook sre.wikireplicas.add-wiki (exit_code=0) for database abstractwiki ([[phab:T420637|T420637]])
* 11:02 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2002.codfw.wmnet
* 10:54 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2006.codfw.wmnet
* 10:51 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2002.codfw.wmnet
* 10:50 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2006.codfw.wmnet
* 10:46 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host poolcounter2005.codfw.wmnet
* 10:43 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host poolcounter2005.codfw.wmnet
* 10:33 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on kafka-test1006.eqiad.wmnet with reason: host reimage
* 10:27 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on kafka-test1006.eqiad.wmnet with reason: host reimage
* 10:18 taavi@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database abstractwiki ([[phab:T420637|T420637]])
* 10:12 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host kafka-test1006.eqiad.wmnet with OS trixie
* 10:04 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 10:03 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:01 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:00 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:58 dpogorzelski@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:57 elukey@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
* 09:37 elukey@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
* 09:06 elukey@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 09:05 elukey@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 09:04 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 09:03 elukey@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 08:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 08:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 08:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:05 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 08:04 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 08:02 elukey@deploy1003: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 07:46 elukey@deploy1003: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 03:06 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 02:32 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 02:12 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 02:07 mwpresync@deploy1003: Finished scap build-images: Publishing wmf/next image (duration: 06m 07s)
* 02:01 mwpresync@deploy1003: Started scap build-images: Publishing wmf/next image
* 01:30 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421330|T421330]]
* 01:30 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on zuul1001.eqiad.wmnet with reason: [[phab:T421330|T421330]]
* 01:30 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on zuul2001.codfw.wmnet with reason: [[phab:T421330|T421330]]
* 01:29 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421330|T421330]]
* 01:12 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
== 2026-03-26 ==
* 21:35 reedy@deploy1003: Finished scap sync-world: Backport for [[gerrit:1260834{{!}}Add Logstash logging for successful passwordless logins]], [[gerrit:1261511{{!}}InitialiseSettings: Remove apiportalwiki from $wmgCentralAuthAutoLoginWikis (T421413)]] (duration: 06m 58s)
* 21:31 reedy@deploy1003: catrope, reedy: Continuing with sync
* 21:30 reedy@deploy1003: catrope, reedy: Backport for [[gerrit:1260834{{!}}Add Logstash logging for successful passwordless logins]], [[gerrit:1261511{{!}}InitialiseSettings: Remove apiportalwiki from $wmgCentralAuthAutoLoginWikis (T421413)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:28 reedy@deploy1003: Started scap sync-world: Backport for [[gerrit:1260834{{!}}Add Logstash logging for successful passwordless logins]], [[gerrit:1261511{{!}}InitialiseSettings: Remove apiportalwiki from $wmgCentralAuthAutoLoginWikis (T421413)]]
* 21:00 suecarmol@deploy1003: Finished scap sync-world: Backport for [[gerrit:1256498{{!}}PersonalDashboard: Add config for Active Discussions (T420785)]] (duration: 13m 53s)
* 20:54 suecarmol@deploy1003: suecarmol: Continuing with sync
* 20:51 suecarmol@deploy1003: suecarmol: Backport for [[gerrit:1256498{{!}}PersonalDashboard: Add config for Active Discussions (T420785)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:46 suecarmol@deploy1003: Started scap sync-world: Backport for [[gerrit:1256498{{!}}PersonalDashboard: Add config for Active Discussions (T420785)]]
* 20:44 kamila@deploy1003: Finished scap sync-world: Backport for [[gerrit:1261545{{!}}Wrap 'centralauthtoken' in a JWT (T420280)]], [[gerrit:1261470{{!}}Enable $wgTempCategoryCollations for testwiki. (T419274 T419049)]] (duration: 37m 32s)
* 20:30 kamila@deploy1003: matmarex, kamila: Continuing with sync
* 20:25 kamila@deploy1003: matmarex, kamila: Backport for [[gerrit:1261545{{!}}Wrap 'centralauthtoken' in a JWT (T420280)]], [[gerrit:1261470{{!}}Enable $wgTempCategoryCollations for testwiki. (T419274 T419049)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:11 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host restbase2039.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:09 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host restbase2039.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART
* 20:08 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>aqs1015.eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 20:08 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host restbase2039
* 20:07 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host restbase2039
* 20:06 kamila@deploy1003: Started scap sync-world: Backport for [[gerrit:1261545{{!}}Wrap 'centralauthtoken' in a JWT (T420280)]], [[gerrit:1261470{{!}}Enable $wgTempCategoryCollations for testwiki. (T419274 T419049)]]
* 20:05 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:05 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding restbase2039 to codfw - jhancock@cumin2002"
* 20:05 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding restbase2039 to codfw - jhancock@cumin2002"
* 20:02 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 20:00 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>aqs1015.eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 19:47 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 19:44 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs6003.drmrs.wmnet<nowiki>}</nowiki> and A:liberica
* 18:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 18:48 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4008.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 18:42 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/zotero: apply
* 18:42 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/zotero: apply
* 18:42 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
* 18:41 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
* 18:41 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/toolhub: apply
* 18:40 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/toolhub: apply
* 18:40 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/termbox: apply
* 18:40 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 18:39 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/termbox: apply
* 18:39 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:38 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 18:37 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 18:37 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 18:37 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 18:36 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 18:36 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4009.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 18:36 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 18:36 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/sessionstore: apply
* 18:35 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/sessionstore: apply
* 18:35 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/recommendation-api: apply
* 18:35 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/recommendation-api: apply
* 18:34 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/rdf-streaming-updater: apply
* 18:34 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/rdf-streaming-updater: apply
* 18:33 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/push-notifications: apply
* 18:33 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/push-notifications: apply
* 18:32 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 18:31 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/proton: apply
* 18:31 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/page-analytics: apply
* 18:30 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/page-analytics: apply
* 18:30 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 18:30 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
* 18:30 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 18:28 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 18:28 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/media-analytics: apply
* 18:28 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/media-analytics: apply
* 18:27 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/machinetranslation: apply
* 18:25 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>sessionstore1006.eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 18:21 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/machinetranslation: apply
* 18:21 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/linkrecommendation: apply
* 18:20 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/linkrecommendation: apply
* 18:20 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/ipoid: apply
* 18:19 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/ipoid: apply
* 18:19 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/image-suggestion: apply
* 18:19 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/image-suggestion: apply
* 18:18 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>sessionstore1006.eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 18:18 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/geo-analytics: apply
* 18:17 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/geo-analytics: apply
* 18:17 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams-internal: apply
* 18:16 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams-internal: apply
* 18:15 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventstreams: apply
* 18:14 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventstreams: apply
* 18:13 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
* 18:13 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
* 18:13 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
* 18:12 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
* 18:12 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:12 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 18:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 18:11 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 18:11 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/editor-analytics: apply
* 18:11 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/editor-analytics: apply
* 18:10 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/edit-analytics: apply
* 18:09 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/edit-analytics: apply
* 18:09 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/echostore: apply
* 18:08 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/echostore: apply
* 18:08 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/device-analytics: apply
* 18:08 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/device-analytics: apply
* 18:08 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 18:07 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 18:07 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/data-gateway: apply
* 18:07 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/data-gateway: apply
* 18:07 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 18:06 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 18:06 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/commons-impact-analytics: apply
* 18:06 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/commons-impact-analytics: apply
* 18:04 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 18:04 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 18:03 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 18:03 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/cirrus-streaming-updater: apply
* 18:03 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/chart-renderer: apply
* 18:02 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/chart-renderer: apply
* 17:59 rzl@deploy1003: helmfile [eqiad] DONE helmfile.d/services/apertium: apply
* 17:58 rzl@deploy1003: helmfile [eqiad] START helmfile.d/services/apertium: apply
* 17:55 swfrench@deploy1003: Finished scap sync-world: helmfile-only deployment to enable envoy drain on remaining services - [[phab:T364245|T364245]] (duration: 05m 31s)
* 17:52 swfrench@deploy1003: Started scap sync-world: helmfile-only deployment to enable envoy drain on remaining services - [[phab:T364245|T364245]]
* 17:35 kevinbazira@deploy1003: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:39 rzl@deploy1003: Finished scap sync-world: https://gerrit.wikimedia.org/r/1256396 [[phab:T420666|T420666]] (duration: 11m 21s)
* 16:35 rzl@deploy1003: rzl: Continuing with sync
* 16:34 rzl@deploy1003: rzl: https://gerrit.wikimedia.org/r/1256396 [[phab:T420666|T420666]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:31 rzl@deploy1003: Started scap sync-world: https://gerrit.wikimedia.org/r/1256396 [[phab:T420666|T420666]]
* 16:27 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 16:17 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 16:17 blake@deploy1003: Finished scap sync-world: Test deployment to validate deployment server switchover - [[phab:T413974|T413974]] (duration: 31m 09s)
* 16:16 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 16:05 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 15:47 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.clone (exit_code=0) of db1202.eqiad.wmnet onto db1253.eqiad.wmnet
* 15:47 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1253: Pool db1253.eqiad.wmnet in after cloning
* 15:46 blake@deploy1003: Started scap sync-world: Test deployment to validate deployment server switchover - [[phab:T413974|T413974]]
* 15:44 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 15:43 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on zuul2002.codfw.wmnet with reason: [[phab:T421330|T421330]]
* 15:43 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on zuul1002.eqiad.wmnet with reason: [[phab:T421330|T421330]]
* 15:33 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 15:30 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync
* 15:30 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: sync
* 15:23 blake@dns1004: END - running authdns-update
* 15:22 bjensen: updating dns for the deployment host switchover
* 15:21 blake@dns1004: START - running authdns-update
* 15:19 blake@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases2003.codfw.wmnet,releases1003.eqiad.wmnet with reason: Deployment server switchover
* 15:03 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:01 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1253: Pool db1253.eqiad.wmnet in after cloning
* 14:39 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:28 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:23 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 14:22 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 14:21 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 14:21 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 14:21 elukey@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:20 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 14:20 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 14:20 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 14:19 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 14:18 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 14:18 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 14:17 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 14:17 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 14:17 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 14:17 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 14:01 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1202: Pool db1202.eqiad.wmnet in after cloning
* 13:57 jynus: dropping ms-backup[12]00[12] grants from backup1-* dbs [[phab:T420464|T420464]]
* 13:56 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1070.eqiad.wmnet
* 13:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1070.eqiad.wmnet
* 13:55 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1097.eqiad.wmnet
* 13:55 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1097.eqiad.wmnet
* 13:54 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1055.eqiad.wmnet
* 13:53 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1055.eqiad.wmnet
* 13:46 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:45 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 13:40 sergi0: UTC afternoon backport window done
* 13:39 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259132{{!}}GrowthExperiments: scale edit and thanks query limit to more wikis (T341599)]] (duration: 09m 17s)
* 13:35 sgimeno@deploy2002: sgimeno: Continuing with sync
* 13:32 sgimeno@deploy2002: sgimeno: Backport for [[gerrit:1259132{{!}}GrowthExperiments: scale edit and thanks query limit to more wikis (T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1259132{{!}}GrowthExperiments: scale edit and thanks query limit to more wikis (T341599)]]
* 13:26 jforrester@deploy2002: Finished deploy [integration/docroot@f021d3f]: {{Gerrit|Ia936ecd68e675cff2925dba933e3b67b9bad4cd6}} (duration: 00m 11s)
* 13:26 jforrester@deploy2002: Started deploy [integration/docroot@f021d3f]: {{Gerrit|Ia936ecd68e675cff2925dba933e3b67b9bad4cd6}}
* 13:24 kamila@deploy2002: Finished scap sync-world: Backport for [[gerrit:1256384{{!}}Temporarily add shellbox-icu to $wgShellboxUrls (T419049 T419242 T419274)]] (duration: 07m 16s)
* 13:20 kamila@deploy2002: kamila: Continuing with sync
* 13:19 kamila@deploy2002: kamila: Backport for [[gerrit:1256384{{!}}Temporarily add shellbox-icu to $wgShellboxUrls (T419049 T419242 T419274)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:17 kamila@deploy2002: Started scap sync-world: Backport for [[gerrit:1256384{{!}}Temporarily add shellbox-icu to $wgShellboxUrls (T419049 T419242 T419274)]]
* 13:15 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1202: Pool db1202.eqiad.wmnet in after cloning
* 13:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 13:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:15 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 13:14 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 13:13 kamila@deploy2002: Finished scap sync-world: Backport for [[gerrit:1261420{{!}}cswiki: lift IP cap for editathon (T421305)]] (duration: 07m 22s)
* 13:12 btullis@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 13:09 kamila@deploy2002: kamila, anzx: Continuing with sync
* 13:08 jynus: deploying new grants for new ms-backup hosts and removing old ones [[phab:T420464|T420464]]
* 13:08 kamila@deploy2002: kamila, anzx: Backport for [[gerrit:1261420{{!}}cswiki: lift IP cap for editathon (T421305)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:06 kamila@deploy2002: Started scap sync-world: Backport for [[gerrit:1261420{{!}}cswiki: lift IP cap for editathon (T421305)]]
* 13:03 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:52 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:50 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:43 cdanis: puppet reenabled on drmrs, CIDERGRINDER deployed
* 12:39 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 12:39 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 12:23 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:12 cdanis: on cumin1003.eqiad.wmnet: sudo cumin 'A:cp-drmrs' 'disable-puppet "cdanis CIDER"'
* 12:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1004.eqiad.wmnet
* 12:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1006.eqiad.wmnet
* 12:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1003.eqiad.wmnet
* 12:02 elukey@cumin1003: END (PASS) - Cookbook sre.postgresql.postgres-init (exit_code=0)
* 12:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1002.eqiad.wmnet
* 12:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1005.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1006.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1005.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1004.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1003.eqiad.wmnet
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1002.eqiad.wmnet
* 11:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1001.eqiad.wmnet
* 11:44 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:43 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:43 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:42 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1008.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:42 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1007.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:42 jclark@cumin1003: START - Cookbook sre.hosts.provision for host kafka-logging1006.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:41 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:41 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 11:41 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 11:41 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1001.eqiad.wmnet
* 11:38 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1002.eqiad.wmnet
* 11:37 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 11:33 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1002.eqiad.wmnet
* 11:32 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1202: Depool db1202.eqiad.wmnet to then clone it to db1253.eqiad.wmnet - fceratto@cumin1003
* 11:31 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
* 11:31 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1202: Depool db1202.eqiad.wmnet to then clone it to db1253.eqiad.wmnet - fceratto@cumin1003
* 11:31 fceratto@cumin1003: START - Cookbook sre.mysql.clone of db1202.eqiad.wmnet onto db1253.eqiad.wmnet
* 11:31 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 11:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1001.eqiad.wmnet
* 11:22 elukey@cumin1003: START - Cookbook sre.postgresql.postgres-init
* 11:22 elukey@cumin1003: END (FAIL) - Cookbook sre.postgresql.postgres-init (exit_code=99)
* 11:22 elukey@cumin1003: START - Cookbook sre.postgresql.postgres-init
* 11:19 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1001.eqiad.wmnet
* 11:15 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: sync
* 11:14 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: sync
* 11:14 elukey@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: sync
* 11:13 elukey@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: sync
* 11:07 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 11:04 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260802{{!}}SI: Enable on bnwiki, itwiki, simplewiki, and plwiki (T415529)]] (duration: 09m 23s)
* 10:59 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 10:56 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1260802{{!}}SI: Enable on bnwiki, itwiki, simplewiki, and plwiki (T415529)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:54 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1260802{{!}}SI: Enable on bnwiki, itwiki, simplewiki, and plwiki (T415529)]]
* 10:33 oblivian@deploy1003: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: sync
* 10:32 oblivian@deploy1003: helmfile [eqiad] START helmfile.d/services/wikifeeds: sync
* 10:32 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 10:32 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 10:31 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 10:31 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 10:31 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 10:31 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 10:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:23 tappof@cumin1003: END (PASS) - Cookbook sre.o11y.thanos-compact-restart (exit_code=0)
* 10:23 tappof@cumin1003: START - Cookbook sre.o11y.thanos-compact-restart
* 10:22 tappof@cumin1003: END (PASS) - Cookbook sre.o11y.thanos-compact-restart (exit_code=0)
* 10:22 tappof@cumin1003: START - Cookbook sre.o11y.thanos-compact-restart
* 10:12 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s1
* 10:11 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s1
* 10:05 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s4
* 10:05 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s4
* 09:58 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s8
* 09:58 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s8
* 09:53 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 09:53 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:52 hashar: Starting Gerrit on the replica / gerrit1003
* 09:51 hashar: Stopping Gerrit on the replica / gerrit1003 to clear web sessions
* 09:51 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s7
* 09:50 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s7
* 09:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 09:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 09:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 09:46 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 09:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 09:44 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s3
* 09:43 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s3
* 09:42 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 09:36 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s2
* 09:36 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s2
* 09:31 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 09:29 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s5
* 09:29 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:29 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s5
* 09:26 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 09:22 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section s6
* 09:22 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 09:22 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section s6
* 09:18 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 09:16 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section es6
* 09:15 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section es6
* 09:13 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 09:11 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 09:10 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 09:08 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section x3
* 09:07 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section x3
* 09:05 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 09:02 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.finalize (exit_code=0) for the switch from codfw to eqiad for section x1
* 09:01 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.finalize for the switch from codfw to eqiad for section x1
* 09:01 federico3: starting [[phab:T416708|T416708]] - disabling circular replication on core dbs
* 08:56 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 08:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 08:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 08:43 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:41 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:32 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 08:27 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 08:18 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 08:11 hashar@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.21 refs [[phab:T420479|T420479]]
* 05:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Downgrade clouddb1022 to 10.11.13
== 2026-03-25 ==
* 23:59 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on zuul2001.codfw.wmnet with reason: [[phab:T421330|T421330]]
* 23:58 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on zuul1001.eqiad.wmnet with reason: [[phab:T421330|T421330]]
* 23:29 mutante: zuul1001 - installed mariadb-client - connected once to zuul db on m1-master; mysql> truncate "alembic_version"; - systemctl restart zuul-web - This fixed the zuul-web service; systemctl status finally shows no errors. ([[phab:T405119|T405119]])
* 21:38 ryankemper: [opensearch-k8s] [[phab:T414484|T414484]] Depooled eqiad; change verified working (now when I do `host k8s-ingress-dse-aa.discovery.wmnet` from `cumin1003`, and then reverse-lookup the resulting IP, I get a codfw address); so traffic is now routing to dse-k8s-codfw
* 21:35 ryankemper@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 21:30 Dreamy_Jazz: Created cusi_case, cusi_user, and cusi_signal on bnwiki, itwiki, simplewiki, plwiki for [[phab:T415529|T415529]]
* 21:27 ryankemper: [opensearch-k8s] [[phab:T414484|T414484]] Getting ready to depool `dnsdisc=k8s-ingress-dse-aa,name=eqiad`, leaving codfw pooled. This will get us ready for a full rolling-upgrade of the dse-k8s-eqiad cluster tomorrow.
* 21:23 Dreamy_Jazz: Evening UTC backport window done
* 21:08 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260797{{!}}SuggestedInvestigations: Import session into signal matching job (T421062)]] (duration: 10m 26s)
* 21:04 kharlan@deploy2002: kharlan: Continuing with sync
* 21:01 kharlan@deploy2002: kharlan: Backport for [[gerrit:1260797{{!}}SuggestedInvestigations: Import session into signal matching job (T421062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:58 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1260797{{!}}SuggestedInvestigations: Import session into signal matching job (T421062)]]
* 20:51 eevans@cumin1003: END (ERROR) - Cookbook sre.cassandra.roll-reboot (exit_code=97) rolling reboot on P<nowiki>{</nowiki>sessionstore[1004-1006].eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 20:43 aaron@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259183{{!}}Add Analytics APIs to the RestSandbox (T419429)]] (duration: 08m 33s)
* 20:38 aaron@deploy2002: aaron: Continuing with sync
* 20:36 aaron@deploy2002: aaron: Backport for [[gerrit:1259183{{!}}Add Analytics APIs to the RestSandbox (T419429)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:34 aaron@deploy2002: Started scap sync-world: Backport for [[gerrit:1259183{{!}}Add Analytics APIs to the RestSandbox (T419429)]]
* 20:30 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247073{{!}}Deploy temporary accounts to ruwiki (T413771)]] (duration: 11m 04s)
* 20:25 jdlrobson@deploy2002: stran, jdlrobson: Continuing with sync
* 20:21 jdlrobson@deploy2002: stran, jdlrobson: Backport for [[gerrit:1247073{{!}}Deploy temporary accounts to ruwiki (T413771)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:19 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1247073{{!}}Deploy temporary accounts to ruwiki (T413771)]]
* 20:17 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255775{{!}}Close the legacy-vector dblist (T421289)]] (duration: 07m 42s)
* 20:14 aokoth@cumin1003: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:14 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:13 aokoth@cumin1003: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:13 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:12 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 20:12 aokoth@cumin1003: END (FAIL) - Cookbook sre.gitlab.upgrade (exit_code=99) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 20:12 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1255775{{!}}Close the legacy-vector dblist (T421289)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1255775{{!}}Close the legacy-vector dblist (T421289)]]
* 20:05 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy (exit_code=0) rolling reboot on P<nowiki>{</nowiki>hcaptcha-proxy7002.wikimedia.org<nowiki>}</nowiki> and A:hcaptcha-proxy
* 20:01 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-hcaptcha-proxy rolling reboot on P<nowiki>{</nowiki>hcaptcha-proxy7002.wikimedia.org<nowiki>}</nowiki> and A:hcaptcha-proxy
* 20:00 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>sessionstore[1004-1006].eqiad.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 19:34 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pybal-test2003.codfw.wmnet
* 19:30 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host pybal-test2003.codfw.wmnet
* 19:26 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 19:24 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>sessionstore[2004-2006].codfw.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 19:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2011.codfw.wmnet
* 19:17 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2011.codfw.wmnet
* 19:17 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 19:14 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2011.codfw.wmnet with reason: Planned reboot
* 19:11 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T421278|T421278]]
* 19:11 cdobbins@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 19:07 cdobbins@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 19:00 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2012.codfw.wmnet
* 18:57 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2012.codfw.wmnet
* 18:53 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>sessionstore[2004-2006].codfw.wmnet<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 18:51 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
* 18:51 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
* 18:50 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
* 18:50 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
* 18:49 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
* 18:49 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
* 18:49 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/toolhub: apply
* 18:48 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/toolhub: apply
* 18:48 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/termbox: apply
* 18:47 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/termbox: apply
* 18:47 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 18:47 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 18:47 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 18:46 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 18:46 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:46 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:46 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 18:46 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs2012.codfw.wmnet with reason: Planned reboot
* 18:46 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 18:46 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 18:45 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 18:45 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 18:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 18:44 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/sessionstore: apply
* 18:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/sessionstore: apply
* 18:44 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/recommendation-api: apply
* 18:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/recommendation-api: apply
* 18:43 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/rdf-streaming-updater: apply
* 18:43 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/rdf-streaming-updater: apply
* 18:43 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/push-notifications: apply
* 18:42 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/push-notifications: apply
* 18:42 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 18:41 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/proton: apply
* 18:41 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 18:40 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/page-analytics: apply
* 18:40 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/page-analytics: apply
* 18:39 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 18:39 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
* 18:37 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 18:37 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 18:37 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7001.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 18:37 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 18:35 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 18:34 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/media-analytics: apply
* 18:34 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/media-analytics: apply
* 18:33 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
* 18:29 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 18:28 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
* 18:28 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/linkrecommendation: apply
* 18:26 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/linkrecommendation: apply
* 18:26 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/ipoid: apply
* 18:26 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/ipoid: apply
* 18:25 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases2003.codfw.wmnet with reason: debug java install
* 18:25 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7002.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 18:25 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases1003.eqiad.wmnet with reason: debug java install
* 18:25 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/image-suggestion: apply
* 18:24 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/image-suggestion: apply
* 18:23 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/geo-analytics: apply
* 18:23 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/geo-analytics: apply
* 18:23 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams-internal: apply
* 18:22 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams-internal: apply
* 18:22 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventstreams: apply
* 18:22 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventstreams: apply
* 18:21 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
* 18:21 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
* 18:20 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
* 18:20 mutante: releases1003 - apt-get upgrade - envoyproxy, python3-wmflib
* 18:20 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
* 18:20 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 18:19 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 18:19 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 18:18 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 18:18 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/editor-analytics: apply
* 18:18 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/editor-analytics: apply
* 18:17 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/edit-analytics: apply
* 18:17 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/edit-analytics: apply
* 18:16 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/echostore: apply
* 18:16 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/echostore: apply
* 18:15 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/device-analytics: apply
* 18:15 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/device-analytics: apply
* 18:15 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 18:14 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 18:14 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/data-gateway: apply
* 18:14 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/data-gateway: apply
* 18:14 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 18:13 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 18:13 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/commons-impact-analytics: apply
* 18:13 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/commons-impact-analytics: apply
* 18:12 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 18:12 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
* 18:11 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 18:11 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/cirrus-streaming-updater: apply
* 18:11 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/chart-renderer: apply
* 18:11 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/chart-renderer: apply
* 18:09 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/apertium: apply
* 18:09 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/apertium: apply
* 17:29 sfaci@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 17:29 sfaci@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 17:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 17:23 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on releases1003.eqiad.wmnet with reason: debug java install
* 17:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 16:44 ebysans@deploy2002: Finished deploy [analytics/refinery@80c527b] (thin): Regular analytics weekly train THIN [analytics/refinery@80c527b6] (duration: 01m 59s)
* 16:42 ebysans@deploy2002: Started deploy [analytics/refinery@80c527b] (thin): Regular analytics weekly train THIN [analytics/refinery@80c527b6]
* 16:42 SandraEbele_: Deploying Refinery as part of weekly deployment train
* 16:41 ebysans@deploy2002: Finished deploy [analytics/refinery@80c527b]: Regular analytics weekly train [analytics/refinery@80c527b6] (duration: 04m 32s)
* 16:37 ebysans@deploy2002: Started deploy [analytics/refinery@80c527b]: Regular analytics weekly train [analytics/refinery@80c527b6]
* 16:22 ebysans@deploy2002: Finished deploy [analytics/refinery@80c527b] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@80c527b6] (duration: 01m 58s)
* 16:22 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 16:21 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:21 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:20 ebysans@deploy2002: Started deploy [analytics/refinery@80c527b] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@80c527b6]
* 16:20 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 16:19 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 16:18 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:18 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:06 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 16:05 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 16:05 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 16:04 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 16:03 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:02 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 16:02 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 16:01 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:51 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:50 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:42 blake@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244628{{!}}debug: reorder debug backends for eqiad switchover (T413974)]] (duration: 07m 41s)
* 15:41 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:37 blake@deploy2002: blake: Continuing with sync
* 15:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:37 blake@deploy2002: blake: Backport for [[gerrit:1244628{{!}}debug: reorder debug backends for eqiad switchover (T413974)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:34 blake@deploy2002: Started scap sync-world: Backport for [[gerrit:1244628{{!}}debug: reorder debug backends for eqiad switchover (T413974)]]
* 15:34 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 15:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 15:32 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-unlock-scap (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:32 root@deploy2002: Unlocked for deployment [ALL REPOSITORIES]: Datacenter switchover from codfw to eqiad - (duration: 91m 45s)
* 15:32 root@deploy2002: Forcefully removing global lock: Datacenter switchover from codfw to eqiad -
* 15:32 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-unlock-scap for datacenter switchover from codfw to eqiad
* 15:31 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:26 blake@dns1004: END - running authdns-update
* 15:24 blake@dns1004: START - running authdns-update
* 15:24 elukey@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'sync'.
* 15:23 elukey@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'sync'.
* 15:18 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters for datacenter switchover from codfw to eqiad
* 15:18 blake@dns1004: END - running authdns-update
* 15:16 blake@dns1004: START - running authdns-update
* 15:14 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-restore-ttl (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:13 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-restore-ttl for datacenter switchover from codfw to eqiad
* 15:11 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:10 root@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 15:09 root@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 15:08 root@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 15:07 root@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 15:07 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.08-start-maintenance for datacenter switchover from codfw to eqiad
* 15:07 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:07 root@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: sync
* 15:07 root@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: sync
* 15:07 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner for datacenter switchover from codfw to eqiad
* 15:02 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0) for datacenter switchover from codfw to eqiad
* 15:02 blake@cumin1003: MediaWiki read-only period ends at: 2026-03-25 15:02:52.921926
* 14:55 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0) for datacenter switchover from codfw to eqiad
* 14:53 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance for datacenter switchover from codfw to eqiad
* 14:52 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks (exit_code=0) for datacenter switchover from codfw to eqiad
* 14:52 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks for datacenter switchover from codfw to eqiad
* 14:51 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0) for datacenter switchover from codfw to eqiad
* 14:46 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl for datacenter switchover from codfw to eqiad
* 14:28 cdanis: 💙cdanis@apt1002.wikimedia.org ~ 🕥☕ sudo -i reprepro --component main --restrict cidergrinder update trixie-wikimedia
* 14:28 cdanis: 💙cdanis@apt1002.wikimedia.org ~ 🕥☕ sudo -i reprepro --component main --restrict cidergrinder update bullseye-wikimedia
* 14:24 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['phab2002']
* 14:24 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 14:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 14:17 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['phab2002']
* 14:14 gengh@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:13 gengh@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:13 gengh@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:12 gengh@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 gengh@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:11 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:11 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:11 gengh@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:10 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:10 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:08 gengh@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:07 gengh@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 gengh@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:07 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:06 gengh@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:06 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:06 gengh@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:05 gengh@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:00 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-lock-scap (exit_code=0) for datacenter switchover from codfw to eqiad
* 14:00 root@deploy2002: Locking from deployment [ALL REPOSITORIES]: Datacenter switchover from codfw to eqiad -
* 14:00 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-lock-scap for datacenter switchover from codfw to eqiad
* 13:49 otto@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260091{{!}}EventStreamConfig - Increase spark_job_ingestion_scale for larger event streams (T360794 T351225)]] (duration: 07m 48s)
* 13:45 otto@deploy2002: otto: Continuing with sync
* 13:45 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:44 otto@deploy2002: otto: Backport for [[gerrit:1260091{{!}}EventStreamConfig - Increase spark_job_ingestion_scale for larger event streams (T360794 T351225)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:42 otto@deploy2002: Started scap sync-world: Backport for [[gerrit:1260091{{!}}EventStreamConfig - Increase spark_job_ingestion_scale for larger event streams (T360794 T351225)]]
* 13:32 awight@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260614{{!}}[beta] Kill synthetic refs with feature flag (T421055)]], [[gerrit:1251193{{!}}idwiki: Remove unused user groups on Indonesian Wikipedia (T419105)]], [[gerrit:1251200{{!}}ptwiki: Enable block action for the abuse filter (T419312)]], [[gerrit:1256748{{!}}ptwiki: Add suppressredirect to autoreviewer and rollbacker user groups (T420704)]] (duration: 11m 33s)
* 13:29 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:29 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:27 awight@deploy2002: codenamenoreste, awight, gerrit-patch-uploader: Continuing with sync
* 13:23 awight@deploy2002: codenamenoreste, awight, gerrit-patch-uploader: Backport for [[gerrit:1260614{{!}}[beta] Kill synthetic refs with feature flag (T421055)]], [[gerrit:1251193{{!}}idwiki: Remove unused user groups on Indonesian Wikipedia (T419105)]], [[gerrit:1251200{{!}}ptwiki: Enable block action for the abuse filter (T419312)]], [[gerrit:1256748{{!}}ptwiki: Add suppressredirect to autoreviewer and rollbacker user groups (T420704)]]
* 13:20 awight@deploy2002: Started scap sync-world: Backport for [[gerrit:1260614{{!}}[beta] Kill synthetic refs with feature flag (T421055)]], [[gerrit:1251193{{!}}idwiki: Remove unused user groups on Indonesian Wikipedia (T419105)]], [[gerrit:1251200{{!}}ptwiki: Enable block action for the abuse filter (T419312)]], [[gerrit:1256748{{!}}ptwiki: Add suppressredirect to autoreviewer and rollbacker user groups (T420704)]]
* 13:17 dcausse@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260045{{!}}Revert^2 "search: use the discovery ns record for the semanticsearch cluster"]] (duration: 10m 20s)
* 13:12 dcausse@deploy2002: dcausse: Continuing with sync
* 13:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:09 dcausse@deploy2002: dcausse: Backport for [[gerrit:1260045{{!}}Revert^2 "search: use the discovery ns record for the semanticsearch cluster"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:07 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:06 dcausse@deploy2002: Started scap sync-world: Backport for [[gerrit:1260045{{!}}Revert^2 "search: use the discovery ns record for the semanticsearch cluster"]]
* 13:06 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:02 XioNoX: Inter.Link - DDoS - Activation of automatic reroute
* 12:56 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:51 marostegui@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 12:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Downgrade clouddb1022 to 10.11.15
* 12:41 marostegui@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 12:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-mariadb1002.eqiad.wmnet
* 12:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-test-coord1002.eqiad.wmnet
* 12:38 mszwarc@deploy2002: mwscript-k8s job started: foreachwikiindblist all demoteIneligibleUsers.php --relay-log checkuser=metawiki --relay-log suppress=metawiki # [[phab:T418580|T418580]]
* 12:34 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-test-coord1002.eqiad.wmnet
* 12:34 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-mariadb1002.eqiad.wmnet
* 12:33 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:32 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs1028.eqiad.wmnet
* 12:25 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host wdqs1028.eqiad.wmnet
* 12:24 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:19 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260617{{!}}Allow for demoting 2FA-less members of further 6 groups (T418580)]] (duration: 10m 23s)
* 12:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wdqs2009.codfw.wmnet
* 12:12 mszwarc@deploy2002: mszwarc: Continuing with sync
* 12:11 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1260617{{!}}Allow for demoting 2FA-less members of further 6 groups (T418580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:09 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1260617{{!}}Allow for demoting 2FA-less members of further 6 groups (T418580)]]
* 12:07 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host wdqs2009.codfw.wmnet
* 12:07 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl2002.codfw.wmnet
* 12:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl2002.codfw.wmnet
* 11:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl2001.codfw.wmnet
* 11:53 marostegui: Restart clouddb1022:s3 to enable error_log [[phab:T420177|T420177]]
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl2001.codfw.wmnet
* 11:51 jayme: migrated wikikube apiservers (eqiad and codfw) to IPIP - [[phab:T420436|T420436]]
* 11:49 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: wikikube-master-codfw@codfw
* 11:49 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 11:48 jayme@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 11:46 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: wikikube-master-eqiad@eqiad
* 11:46 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 11:45 jayme@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 11:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1002.eqiad.wmnet
* 11:43 jayme@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: wikikube-master-codfw@codfw
* 11:41 jayme@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: wikikube-master-eqiad@eqiad
* 11:40 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1002.eqiad.wmnet
* 11:38 mvernon@deploy2002: helmfile [codfw] DONE helmfile.d/services/kartotherian: apply
* 11:36 mvernon@deploy2002: helmfile [codfw] START helmfile.d/services/kartotherian: apply
* 11:21 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:18 mvernon@deploy2002: helmfile [codfw] DONE helmfile.d/services/kartotherian: apply
* 11:16 mvernon@deploy2002: helmfile [codfw] DONE helmfile.d/services/tegola-vector-tiles: apply
* 11:15 mvernon@deploy2002: helmfile [codfw] START helmfile.d/services/tegola-vector-tiles: apply
* 11:15 mvernon@deploy2002: helmfile [eqiad] DONE helmfile.d/services/tegola-vector-tiles: apply
* 11:14 mvernon@deploy2002: helmfile [eqiad] START helmfile.d/services/tegola-vector-tiles: apply
* 11:07 mvernon@deploy2002: helmfile [codfw] START helmfile.d/services/kartotherian: apply
* 11:07 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis abstractwiki in section s5
* 11:07 mvernon@deploy2002: helmfile [eqiad] DONE helmfile.d/services/kartotherian: apply
* 11:05 mvernon@deploy2002: helmfile [eqiad] START helmfile.d/services/kartotherian: apply
* 10:55 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:53 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis abstractwiki in section s5
* 10:45 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 10:33 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:27 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:26 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:21 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 10:20 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 10:20 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 10:19 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 10:01 elukey@puppetserver1001: conftool action : set/pooled=true; selector: dnsdisc=kartotherian,name=codfw
* 09:58 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:57 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:52 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:52 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:51 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:51 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:46 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:45 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:45 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:44 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:05 elukey@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=aux-k8s-worker200[2-5].codfw.wmnet,cluster=aux-k8s,service=kubesvc
* 09:04 elukey@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=aux-k8s-worker200[6-9].codfw.wmnet,cluster=aux-k8s,service=kubesvc
* 09:04 elukey@puppetserver1001: conftool action : set/pooled=yes:weight=10; selector: name=aux-k8s-worker100[6-9].eqiad.wmnet,cluster=aux-k8s,service=kubesvc
* 08:55 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker200[6-9].eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:55 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker200[6-9].eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:35 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker1009.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:35 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker1008.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker1007.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/weight=10; selector: name=aux-k8s-worker1006.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker1009.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker1008.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker1007.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:34 elukey@puppetserver1001: conftool action : set/pooled=yes; selector: name=aux-k8s-worker1006.eqiad.wmnet,cluster=kubernetes,service=kubesvc
* 08:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device fasw2-c8b-codfw
* 08:29 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device fasw2-c8b-codfw
* 08:29 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device fasw2-c8a-codfw
* 08:29 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device fasw2-c8a-codfw
* 08:10 hashar@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.21 refs [[phab:T420479|T420479]]
* 00:33 rzl@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 00:23 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
* 00:23 rzl@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
* 00:23 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 00:22 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 00:22 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
* 00:22 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
* 00:21 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
* 00:21 rzl@deploy2002: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
* 00:21 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/toolhub: apply
* 00:20 rzl@deploy2002: helmfile [staging] START helmfile.d/services/toolhub: apply
* 00:20 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/termbox: apply
* 00:20 rzl@deploy2002: helmfile [staging] START helmfile.d/services/termbox: apply
* 00:20 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/tegola-vector-tiles: apply
* 00:20 rzl@deploy2002: helmfile [staging] START helmfile.d/services/tegola-vector-tiles: apply
* 00:19 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 00:19 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 00:18 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 00:18 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 00:18 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 00:18 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 00:17 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 00:17 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 00:17 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 00:17 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 00:17 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 00:16 rzl@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
* 00:16 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/sessionstore: apply
* 00:16 rzl@deploy2002: helmfile [staging] START helmfile.d/services/sessionstore: apply
* 00:16 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/recommendation-api: apply
* 00:15 rzl@deploy2002: helmfile [staging] START helmfile.d/services/recommendation-api: apply
* 00:15 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rdf-streaming-updater: apply
* 00:15 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rdf-streaming-updater: apply
* 00:15 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1023.eqiad.wmnet with OS bookworm
* 00:15 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/push-notifications: apply
* 00:15 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:15 rzl@deploy2002: helmfile [staging] START helmfile.d/services/push-notifications: apply
* 00:15 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/proton: apply
* 00:14 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:14 rzl@deploy2002: helmfile [staging] START helmfile.d/services/proton: apply
* 00:13 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/page-analytics: apply
* 00:13 rzl@deploy2002: helmfile [staging] START helmfile.d/services/page-analytics: apply
* 00:13 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 00:13 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
* 00:13 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 00:12 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 00:12 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 00:11 rzl@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
* 00:10 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/media-analytics: apply
* 00:10 rzl@deploy2002: helmfile [staging] START helmfile.d/services/media-analytics: apply
* 00:09 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
* 00:07 rzl@deploy2002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
* 00:07 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/linkrecommendation: apply
* 00:06 rzl@deploy2002: helmfile [staging] START helmfile.d/services/linkrecommendation: apply
* 00:06 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/kartotherian: apply
* 00:06 rzl@deploy2002: helmfile [staging] START helmfile.d/services/kartotherian: apply
* 00:06 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/ipoid: apply
* 00:06 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1022.eqiad.wmnet with OS bookworm
* 00:06 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:06 rzl@deploy2002: helmfile [staging] START helmfile.d/services/ipoid: apply
* 00:06 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/image-suggestion: apply
* 00:06 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:05 rzl@deploy2002: helmfile [staging] START helmfile.d/services/image-suggestion: apply
* 00:05 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/geo-analytics: apply
* 00:05 rzl@deploy2002: helmfile [staging] START helmfile.d/services/geo-analytics: apply
* 00:05 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams-internal: apply
* 00:05 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams-internal: apply
* 00:04 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventstreams: apply
* 00:04 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventstreams: apply
* 00:04 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
* 00:04 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1021.eqiad.wmnet with OS bookworm
* 00:04 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:04 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
* 00:04 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 00:04 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 00:03 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 00:03 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 00:03 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 00:02 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
* 00:02 rzl@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
* 00:02 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/editor-analytics: apply
* 00:01 rzl@deploy2002: helmfile [staging] START helmfile.d/services/editor-analytics: apply
* 00:01 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/edit-analytics: apply
* 00:01 rzl@deploy2002: helmfile [staging] START helmfile.d/services/edit-analytics: apply
* 00:00 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/echostore: apply
* 00:00 rzl@deploy2002: helmfile [staging] START helmfile.d/services/echostore: apply
* 00:00 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/device-analytics: apply
* 00:00 rzl@deploy2002: helmfile [staging] START helmfile.d/services/device-analytics: apply
== 2026-03-24 ==
* 23:59 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 23:59 rzl@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 23:59 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/data-gateway: apply
* 23:59 rzl@deploy2002: helmfile [staging] START helmfile.d/services/data-gateway: apply
* 23:58 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 23:58 rzl@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 23:58 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/commons-impact-analytics: apply
* 23:58 rzl@deploy2002: helmfile [staging] START helmfile.d/services/commons-impact-analytics: apply
* 23:56 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 23:56 rzl@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
* 23:56 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/cirrus-streaming-updater: apply
* 23:56 rzl@deploy2002: helmfile [staging] START helmfile.d/services/cirrus-streaming-updater: apply
* 23:54 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/chart-renderer: apply
* 23:53 rzl@deploy2002: helmfile [staging] START helmfile.d/services/chart-renderer: apply
* 23:53 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1023.eqiad.wmnet with reason: host reimage
* 23:53 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/apertium: apply
* 23:52 rzl@deploy2002: helmfile [staging] START helmfile.d/services/apertium: apply
* 23:46 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1022.eqiad.wmnet with reason: host reimage
* 23:43 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1021.eqiad.wmnet with reason: host reimage
* 23:19 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1023.eqiad.wmnet with OS bookworm
* 23:19 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1022.eqiad.wmnet with OS bookworm
* 23:18 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1021.eqiad.wmnet with OS bookworm
* 23:16 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-videoscaler: apply
* 23:16 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-videoscaler: apply
* 23:15 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-videoscaler: apply
* 23:15 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-videoscaler: apply
* 22:03 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1260118{{!}}Drop inactive simple summary surveys (T389393)]] (duration: 08m 15s)
* 21:57 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 21:57 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1260118{{!}}Drop inactive simple summary surveys (T389393)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1260118{{!}}Drop inactive simple summary surveys (T389393)]]
* 21:52 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259147{{!}}Address FIXME and drop not selector for section headings (T420085)]] (duration: 13m 11s)
* 21:47 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 21:44 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1259147{{!}}Address FIXME and drop not selector for section headings (T420085)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:38 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1259147{{!}}Address FIXME and drop not selector for section headings (T420085)]]
* 21:00 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=frwiki --source-pseudo-namespace=Abstract_ --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:55 jforrester@deploy2002: mwscript-k8s job started: moveBatch --wiki=frwiki '--u=Jdforrester (WMF)' --r=[[phab:T420654|T420654]] --noredirects /home/jforrester/T420654-frwiki-move # [[phab:T420654|T420654]] abstract: is now an interwiki; manual fix
* 20:55 jforrester@deploy2002: mwscript-k8s job started: moveBatch '--u=Jdforrester (WMF)' --r=[[phab:T420654|T420654]] --noredirects /home/jforrester/T420654-frwiki-move # [[phab:T420654|T420654]] abstract: is now an interwiki; manual fix
* 20:47 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=ptwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:46 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=idwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:46 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=frwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:45 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=eswiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:39 jforrester@deploy2002: mwscript-k8s job started: maintenance/namespaceDupes.php --wiki=enwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:39 jforrester@deploy2002: mwscript-k8s job started: sql extensions/WikimediaMaintenance/maintenance/namespaceDupes.php --wiki=enwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:38 jforrester@deploy2002: mwscript-k8s job started: sql maintenance/namespaceDupes.php --wiki=enwiki --fix # [[phab:T420654|T420654]] abstract: is now an interwiki
* 20:38 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1256433{{!}}[wikifunctions] Drop m.wikifunctions.org from lists, we've not used it for years]], [[gerrit:1250114{{!}}Move GrowthExperiments REST API definition to IS]], [[gerrit:1259993{{!}}dumpInterwiki: Re-generate to add Abstract Wikipedia (and others) (T420654)]] (duration: 07m 46s)
* 20:33 jforrester@deploy2002: jforrester: Continuing with sync
* 20:32 jforrester@deploy2002: jforrester: Backport for [[gerrit:1256433{{!}}[wikifunctions] Drop m.wikifunctions.org from lists, we've not used it for years]], [[gerrit:1250114{{!}}Move GrowthExperiments REST API definition to IS]], [[gerrit:1259993{{!}}dumpInterwiki: Re-generate to add Abstract Wikipedia (and others) (T420654)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:30 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1256433{{!}}[wikifunctions] Drop m.wikifunctions.org from lists, we've not used it for years]], [[gerrit:1250114{{!}}Move GrowthExperiments REST API definition to IS]], [[gerrit:1259993{{!}}dumpInterwiki: Re-generate to add Abstract Wikipedia (and others) (T420654)]]
* 20:27 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259967{{!}}Set json object before setting Abstract Wiki Id (T420916)]], [[gerrit:1259994{{!}}AbstractPreview: apply selected preview language lang/dir to abstract preview body (T420687)]], [[gerrit:1260092{{!}}AbstractTitle: Handle pageinfo responses without normalized titles (T420725)]], [[gerrit:1259992{{!}}[abstractwiki] Don't list abstract as a langlist entry (T420654)]]
* 20:22 jforrester@deploy2002: jforrester: Continuing with sync
* 20:22 jforrester@deploy2002: jforrester: Backport for [[gerrit:1259967{{!}}Set json object before setting Abstract Wiki Id (T420916)]], [[gerrit:1259994{{!}}AbstractPreview: apply selected preview language lang/dir to abstract preview body (T420687)]], [[gerrit:1260092{{!}}AbstractTitle: Handle pageinfo responses without normalized titles (T420725)]], [[gerrit:1259992{{!}}[abstractwiki] Don't list abstract as a langlist entry (T420654)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:20 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1259967{{!}}Set json object before setting Abstract Wiki Id (T420916)]], [[gerrit:1259994{{!}}AbstractPreview: apply selected preview language lang/dir to abstract preview body (T420687)]], [[gerrit:1260092{{!}}AbstractTitle: Handle pageinfo responses without normalized titles (T420725)]], [[gerrit:1259992{{!}}[abstractwiki] Don't list abstract as a langlist entry (T420654)]]
* 20:12 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1242542{{!}}Generate our own logo thumbnails rather than using MediaWiki's (T414048)]], [[gerrit:1250095{{!}}Enwikinews: Only enable flaggedRevs in article namespace (T418066)]], [[gerrit:1252684{{!}}Disable magic links on afwiki (T420142)]] (duration: 09m 22s)
* 20:08 jforrester@deploy2002: jforrester, pppery: Continuing with sync
* 20:05 jforrester@deploy2002: jforrester, pppery: Backport for [[gerrit:1242542{{!}}Generate our own logo thumbnails rather than using MediaWiki's (T414048)]], [[gerrit:1250095{{!}}Enwikinews: Only enable flaggedRevs in article namespace (T418066)]], [[gerrit:1252684{{!}}Disable magic links on afwiki (T420142)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:03 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1242542{{!}}Generate our own logo thumbnails rather than using MediaWiki's (T414048)]], [[gerrit:1250095{{!}}Enwikinews: Only enable flaggedRevs in article namespace (T418066)]], [[gerrit:1252684{{!}}Disable magic links on afwiki (T420142)]]
* 19:42 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 19:42 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 19:41 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 19:39 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1258300{{!}}tests: Make many things static for PHPUnit 10 (T420844)]], [[gerrit:1258301{{!}}phpunit.xml: Update configuration for PHPUnit 10 (T420844)]] (duration: 07m 21s)
* 19:35 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2005.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 19:35 reedy@deploy2002: reedy: Continuing with sync
* 19:34 reedy@deploy2002: reedy: Backport for [[gerrit:1258300{{!}}tests: Make many things static for PHPUnit 10 (T420844)]], [[gerrit:1258301{{!}}phpunit.xml: Update configuration for PHPUnit 10 (T420844)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:32 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1258300{{!}}tests: Make many things static for PHPUnit 10 (T420844)]], [[gerrit:1258301{{!}}phpunit.xml: Update configuration for PHPUnit 10 (T420844)]]
* 19:02 inflatador: bking@apt1002 `sudo -E reprepro -C component/opensearch2 include trixie-wikimedia ~/wmf-opensearch-search-plugins-2.19.5+3-trixie/wmf-opensearch-search-plugins_2.19.5+3_amd64.changes`
* 18:48 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 18:48 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1170: Degraded drive replaced [[phab:T420873|T420873]]
* 18:43 jasmine@cumin2002: START - Cookbook sre.hosts.provision for host wikikube-ctrl2004.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 18:36 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 18:35 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 18:25 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:24 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 18:20 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 18:20 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 18:13 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 18:11 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 18:07 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on phab2002.codfw.wmnet with reason: [[phab:T420228|T420228]]
* 18:01 aokoth@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 18:01 aokoth@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 18:01 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 18:00 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 18:00 mutante: codesearch9.codesearch - systemctl restart hound_proxy ([[phab:T421147|T421147]])
* 17:34 brett@cumin2002: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 17:30 brett@cumin2002: START - Cookbook sre.loadbalancer.admin rebooting P<nowiki>{</nowiki>lvs7003.magru.wmnet<nowiki>}</nowiki> and A:liberica
* 17:20 blake@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 17:20 blake@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 17:20 blake@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 17:20 blake@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 17:00 blake@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:00 blake@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:00 blake@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:00 blake@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 16:47 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 16:38 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1113.*
* 16:32 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1170: Degraded drive replaced [[phab:T420873|T420873]]
* 16:24 sfaci@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:24 sfaci@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 16:22 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1113.eqiad.wmnet with OS trixie
* 16:05 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 16:04 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 16:03 blake@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 16:03 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 16:03 blake@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 16:03 blake@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 16:03 blake@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 16:03 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1113.eqiad.wmnet with reason: host reimage
* 15:54 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1113.eqiad.wmnet with reason: host reimage
* 15:54 bjensen: Services portion of the datacenter switchover is complete
* 15:51 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker2009.codfw.wmnet with OS trixie
* 15:46 blake@cumin1003: END (PASS) - Cookbook sre.discovery.datacenter (exit_code=0) depool all services in codfw: Datacenter Switchover - [[phab:T413974|T413974]]
* 15:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 15:38 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1113.eqiad.wmnet with OS trixie
* 15:38 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1113.eqiad.wmnet with OS trixie
* 15:36 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 15:34 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker2009.codfw.wmnet with reason: host reimage
* 15:30 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker2009.codfw.wmnet with reason: host reimage
* 15:20 blake@cumin1003: START - Cookbook sre.discovery.datacenter depool all services in codfw: Datacenter Switchover - [[phab:T413974|T413974]]
* 15:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1113.eqiad.wmnet with OS trixie
* 15:18 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker2009.codfw.wmnet with OS trixie
* 15:16 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker2008.codfw.wmnet with OS trixie
* 14:59 blake@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool codfw [reason: no reason specified, no task ID specified]
* 14:59 blake@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool codfw [reason: no reason specified, no task ID specified]
* 14:59 bjensen: beginning the Traffic and Services portions of the DC switchover, operational followup will be in #wikimedia-sre
* 14:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker2008.codfw.wmnet with reason: host reimage
* 14:56 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker2008.codfw.wmnet with reason: host reimage
* 14:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:50 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1009.eqiad.wmnet with OS trixie
* 14:44 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker2008.codfw.wmnet with OS trixie
* 14:42 aokoth@dns1004: END - running authdns-update
* 14:41 aokoth@dns1004: START - running authdns-update
* 14:34 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1009.eqiad.wmnet with reason: host reimage
* 14:31 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2009.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:27 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1009.eqiad.wmnet with reason: host reimage
* 14:26 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker2009.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:25 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:23 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:20 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2008.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:19 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:19 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:16 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1009.eqiad.wmnet with OS trixie
* 14:15 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker2008.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:14 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox
* 14:13 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox
* 14:13 cmooney@cumin1003: END (PASS) - Cookbook sre.netbox.update-extras (exit_code=0) rolling restart_daemons on A:netbox-canary
* 14:13 cmooney@cumin1003: START - Cookbook sre.netbox.update-extras rolling restart_daemons on A:netbox-canary
* 14:12 dcausse@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259979{{!}}Revert "search: use the discovery ns record for the semanticsearch cluster"]] (duration: 06m 54s)
* 14:08 dcausse@deploy2002: dcausse: Continuing with sync
* 14:07 dcausse@deploy2002: dcausse: Backport for [[gerrit:1259979{{!}}Revert "search: use the discovery ns record for the semanticsearch cluster"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2007.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 14:05 dcausse@deploy2002: Started scap sync-world: Backport for [[gerrit:1259979{{!}}Revert "search: use the discovery ns record for the semanticsearch cluster"]]
* 14:04 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1008.eqiad.wmnet with OS trixie
* 14:01 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker2007.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:59 jforrester@deploy2002: mwscript-k8s job started: sql --wiki=abstractwiki /srv/mediawiki/php-1.46.0-wmf.20/extensions/Translate/sql/mysql/translate_message_group_subscriptions.sql # [[phab:T420656|T420656]] translate_message_group_subscriptions
* 13:59 dcausse@deploy2002: Sync cancelled.
* 13:57 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:52 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker2006.mgmt.codfw.wmnet with chassis set policy GRACEFUL_RESTART
* 13:48 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1008.eqiad.wmnet with reason: host reimage
* 13:46 dcausse@deploy2002: dcausse: Backport for [[gerrit:1259875{{!}}search: use the discovery ns record for the semanticsearch cluster (T414484)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:44 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1008.eqiad.wmnet with reason: host reimage
* 13:44 dcausse@deploy2002: Started scap sync-world: Backport for [[gerrit:1259875{{!}}search: use the discovery ns record for the semanticsearch cluster (T414484)]]
* 13:33 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1008.eqiad.wmnet with OS trixie
* 13:32 sukhe: sudo cumin -b1 -s20 'C:bird' "run-puppet-agent --enable 'merging CR {{Gerrit|1248385}}, [[phab:T413740|T413740]]'"
* 13:30 cmelo@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259231{{!}}Enable the CampaignEvents extension on all wikibooks (T419597)]], [[gerrit:1259237{{!}}Enable $wgCampaignEventsEnableEventGoals in prod wikis (T414149)]] (duration: 12m 43s)
* 13:26 cmelo@deploy2002: cmelo, daimona: Continuing with sync
* 13:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1007.eqiad.wmnet with OS trixie
* 13:23 sukhe: sudo cumin 'C:bird' "disable-puppet 'merging CR {{Gerrit|1248385}}, [[phab:T413740|T413740]]'"
* 13:20 cmelo@deploy2002: cmelo, daimona: Backport for [[gerrit:1259231{{!}}Enable the CampaignEvents extension on all wikibooks (T419597)]], [[gerrit:1259237{{!}}Enable $wgCampaignEventsEnableEventGoals in prod wikis (T414149)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:18 cmelo@deploy2002: Started scap sync-world: Backport for [[gerrit:1259231{{!}}Enable the CampaignEvents extension on all wikibooks (T419597)]], [[gerrit:1259237{{!}}Enable $wgCampaignEventsEnableEventGoals in prod wikis (T414149)]]
* 13:08 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1007.eqiad.wmnet with reason: host reimage
* 13:04 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) payments1012.frack.eqiad.wmnet on all recursors
* 13:04 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache payments1012.frack.eqiad.wmnet on all recursors
* 13:04 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) payments1011.frack.eqiad.wmnet on all recursors
* 13:03 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache payments1011.frack.eqiad.wmnet on all recursors
* 13:03 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) payments1010.frack.eqiad.wmnet on all recursors
* 13:03 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache payments1010.frack.eqiad.wmnet on all recursors
* 13:02 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1007.eqiad.wmnet with reason: host reimage
* 13:00 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:00 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify records for payments servers frack - cmooney@cumin1003"
* 13:00 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: modify records for payments servers frack - cmooney@cumin1003"
* 12:56 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 12:50 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1007.eqiad.wmnet with OS trixie
* 12:02 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1017.eqiad.wmnet
* 12:02 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1017.eqiad.wmnet
* 12:01 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s1
* 11:53 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:53 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:51 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1017.eqiad.wmnet with reason: Rebooting clouddb1017 [[phab:T419960|T419960]]
* 11:51 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1017.eqiad.wmnet
* 11:51 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1017.eqiad.wmnet
* 11:51 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s1
* 11:49 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:49 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:48 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:48 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:46 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:46 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:40 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1006.eqiad.wmnet with OS trixie
* 11:36 volans@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin1001.eqiad.wmnet
* 11:32 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet,service=x3
* 11:32 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1016.eqiad.wmnet,service=x3
* 11:32 volans@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcumin1001.eqiad.wmnet
* 11:31 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1023.eqiad.wmnet,service=x3
* 11:31 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=x3
* 11:31 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1023.eqiad.wmnet,service=x3
* 11:31 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1022.eqiad.wmnet,service=x3
* 11:27 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s3
* 11:27 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1023.eqiad.wmnet,service=s3
* 11:27 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1023.eqiad.wmnet,service=s3
* 11:26 fnegri@cumin1003: conftool action : set/weight=100; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1006.eqiad.wmnet with reason: host reimage
* 11:22 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:19 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:19 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1023.eqiad.wmnet,service=s3
* 11:18 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1006.eqiad.wmnet with reason: host reimage
* 11:18 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1017.eqiad.wmnet,service=s3
* 11:17 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1017.eqiad.wmnet,service=s3
* 11:14 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1023.eqiad.wmnet,service=s3
* 11:14 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1022.eqiad.wmnet,service=s3
* 11:07 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1006.eqiad.wmnet with OS trixie
* 10:55 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:55 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:53 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker2007.codfw.wmnet with OS trixie
* 10:49 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker2006.codfw.wmnet with OS trixie
* 10:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:36 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker2007.codfw.wmnet with reason: host reimage
* 10:33 brouberol@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker2006.codfw.wmnet with reason: host reimage
* 10:30 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:29 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:29 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker2007.codfw.wmnet with reason: host reimage
* 10:28 brouberol@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker2006.codfw.wmnet with reason: host reimage
* 10:22 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:21 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:20 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:20 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:18 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:17 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host aux-k8s-worker1006.eqiad.wmnet with OS trixie
* 10:17 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker2007.codfw.wmnet with OS trixie
* 10:16 brouberol@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker2006.codfw.wmnet with OS trixie
* 10:07 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1006.eqiad.wmnet with OS trixie
* 09:47 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:46 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:43 ayounsi@cumin1003: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:34 ayounsi@cumin1003: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:31 ayounsi@cumin1003: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:31 ayounsi@cumin1003: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:29 ayounsi@cumin1003: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:29 ayounsi@cumin1003: START - Cookbook sre.ganeti.addnode for new host ganeti4008.ulsfo.wmnet to cluster ulsfo02 and group 01
* 09:23 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4008.ulsfo.wmnet with OS bookworm
* 09:05 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 09:01 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4008.ulsfo.wmnet with reason: host reimage
* 08:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main'.
* 08:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old ulsfo ganeti VIP - ayounsi@cumin1003"
* 08:50 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: remove old ulsfo ganeti VIP - ayounsi@cumin1003"
* 08:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:46 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:45 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1170: Degraded drive [[phab:T420873|T420873]]
* 08:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:45 fceratto@cumin1003: START - Cookbook sre.mysql.depool depool db1170: Degraded drive [[phab:T420873|T420873]]
* 08:43 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:39 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host ganeti4008.ulsfo.wmnet with OS bookworm
* 08:31 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main'.
* 08:31 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:29 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:27 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:27 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:25 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 08:24 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:13 hashar@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.21 refs [[phab:T420479|T420479]]
* 07:59 hashar: Changed https://logstash.wikimedia.org/ default page back to /app/dashboards
* 04:01 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.18 (duration: 01m 13s)
* 03:42 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.21 refs [[phab:T420479|T420479]] (duration: 39m 27s)
* 03:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.21 refs [[phab:T420479|T420479]]
* 02:46 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 04s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:56 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 01:40 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp1104.*
* 01:37 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1104.eqiad.wmnet with OS trixie
* 01:14 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1104.eqiad.wmnet with reason: host reimage
* 01:08 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1104.eqiad.wmnet with reason: host reimage
* 00:52 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1104.eqiad.wmnet with OS trixie
* 00:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 00:18 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1115.eqiad.wmnet with OS trixie
== 2026-03-23 ==
* 22:51 rzl: root@apt1002:~# reprepro --noskipold --restrict vopsbot update bookworm-wikimedia
* 22:44 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 22:28 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host an-worker1172.eqiad.wmnet
* 22:25 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1104.eqiad.wmnet with OS trixie
* 22:07 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 22:05 maryum: Deployed security fix for [[phab:T415584|T415584]]
* 21:53 maryum: Deployed security fix for [[phab:T419192|T419192]]
* 21:41 maryum: Deployed security fix for [[phab:T419168|T419168]]
* 21:35 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 21:25 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255847{{!}}testwiki: Add temporary groups for security testing]] (duration: 12m 33s)
* 21:22 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 21:21 catrope@deploy2002: catrope: Continuing with sync
* 21:18 catrope@deploy2002: catrope: Backport for [[gerrit:1255847{{!}}testwiki: Add temporary groups for security testing]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:12 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1255847{{!}}testwiki: Add temporary groups for security testing]]
* 21:05 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1106.eqiad.wmnet [reason: trixie reimaging]
* 21:05 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1106.eqiad.wmnet [reason: trixie reimaging]
* 21:05 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1104.eqiad.wmnet with OS trixie
* 21:04 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1104.eqiad.wmnet [reason: trixie reimaging]
* 21:03 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1103.eqiad.wmnet [reason: trixie reimaging]
* 20:58 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1256394{{!}}Abstract Wikipedia: Fix API call to get page info (T420725)]], [[gerrit:1259085{{!}}[abstractwiki] Enable the Translate extension (T420656)]], [[gerrit:1250113{{!}}Move testwiki-only Attribution REST API definition to IS]] (duration: 11m 12s)
* 20:54 jforrester@deploy2002: jforrester: Continuing with sync
* 20:53 jforrester@deploy2002: jforrester: Backport for [[gerrit:1256394{{!}}Abstract Wikipedia: Fix API call to get page info (T420725)]], [[gerrit:1259085{{!}}[abstractwiki] Enable the Translate extension (T420656)]], [[gerrit:1250113{{!}}Move testwiki-only Attribution REST API definition to IS]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1103.eqiad.wmnet with OS trixie
* 20:50 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy4002.wikimedia.org
* 20:50 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:50 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy4002.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
* 20:50 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy4002.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
* 20:47 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1256394{{!}}Abstract Wikipedia: Fix API call to get page info (T420725)]], [[gerrit:1259085{{!}}[abstractwiki] Enable the Translate extension (T420656)]], [[gerrit:1250113{{!}}Move testwiki-only Attribution REST API definition to IS]]
* 20:46 sukhe@cumin1003: START - Cookbook sre.dns.netbox
* 20:45 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 20:43 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1102.eqiad.wmnet [reason: trixie reimaging]
* 20:42 dani@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254448{{!}}Undeploy participant recruitment survey on ptwiki (T419275)]], [[gerrit:1254450{{!}}Undeploy participant recruitment survey on trwiki (T419275)]], [[gerrit:1254452{{!}}Undeploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1255763{{!}}testKitchen: Add custom stream name (T417050)]], [[gerrit:1259120{{!}}Enable wgCampaignEventsEnableEventGoals in
* 20:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1102.eqiad.wmnet with OS trixie
* 20:41 sukhe@cumin1003: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy4002.wikimedia.org
* 20:40 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts hcaptcha-proxy4001.wikimedia.org
* 20:40 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:40 sukhe@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy4001.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
* 20:39 sukhe@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: hcaptcha-proxy4001.wikimedia.org decommissioned, removing all IPs except the asset tag one - sukhe@cumin1003"
* 20:37 dani@deploy2002: milimetric, daimona, dani: Continuing with sync
* 20:36 dani@deploy2002: milimetric, daimona, dani: Backport for [[gerrit:1254448{{!}}Undeploy participant recruitment survey on ptwiki (T419275)]], [[gerrit:1254450{{!}}Undeploy participant recruitment survey on trwiki (T419275)]], [[gerrit:1254452{{!}}Undeploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1255763{{!}}testKitchen: Add custom stream name (T417050)]], [[gerrit:1259120{{!}}Enable wgCampaignEventsEnableEventGoals i
* 20:35 sukhe@cumin1003: START - Cookbook sre.dns.netbox
* 20:34 dani@deploy2002: Started scap sync-world: Backport for [[gerrit:1254448{{!}}Undeploy participant recruitment survey on ptwiki (T419275)]], [[gerrit:1254450{{!}}Undeploy participant recruitment survey on trwiki (T419275)]], [[gerrit:1254452{{!}}Undeploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1255763{{!}}testKitchen: Add custom stream name (T417050)]], [[gerrit:1259120{{!}}Enable wgCampaignEventsEnableEventGoals in
* 20:31 sukhe@cumin1003: START - Cookbook sre.hosts.decommission for hosts hcaptcha-proxy4001.wikimedia.org
* 20:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1103.eqiad.wmnet with reason: host reimage
* 20:24 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1103.eqiad.wmnet with reason: host reimage
* 20:23 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 20:19 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1102.eqiad.wmnet with reason: host reimage
* 20:17 alexsanford@deploy2002: Finished scap sync-world: Backport for [[gerrit:1256472{{!}}Reduce reauth timeout for editing site JS to 10 minutes (T419605)]] (duration: 07m 32s)
* 20:14 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1102.eqiad.wmnet with reason: host reimage
* 20:13 alexsanford@deploy2002: alexsanford: Continuing with sync
* 20:11 alexsanford@deploy2002: alexsanford: Backport for [[gerrit:1256472{{!}}Reduce reauth timeout for editing site JS to 10 minutes (T419605)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 alexsanford@deploy2002: Started scap sync-world: Backport for [[gerrit:1256472{{!}}Reduce reauth timeout for editing site JS to 10 minutes (T419605)]]
* 20:08 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1103.eqiad.wmnet with OS trixie
* 20:07 alexsanford: Deployed mitigation for [[phab:T419605|T419605]]
* 19:58 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 19:58 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 19:58 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 19:58 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1102.eqiad.wmnet with OS trixie
* 19:57 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 19:54 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 19:54 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 19:54 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 19:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha-proxy4004.wikimedia.org
* 19:51 cdobbins@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1102.eqiad.wmnet with OS trixie
* 19:50 cdobbins@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1103.eqiad.wmnet with OS trixie
* 19:50 ayounsi@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha-proxy4004.wikimedia.org
* 19:47 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 19:47 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha-proxy4003.wikimedia.org
* 19:46 ayounsi@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha-proxy4003.wikimedia.org
* 19:44 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 19:44 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>aqs[1011,1014,1016-1022]*<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 19:42 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1103.eqiad.wmnet with OS trixie
* 19:42 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1103.eqiad.wmnet [reason: trixie reimaging]
* 19:41 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1101.eqiad.wmnet [reason: trixie reimaging]
* 19:41 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1102.eqiad.wmnet with OS trixie
* 19:41 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1102.eqiad.wmnet [reason: trixie reimaging]
* 19:40 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1101.eqiad.wmnet with OS trixie
* 19:39 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1100.eqiad.wmnet [reason: trixie reimaging]
* 19:37 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp1100.eqiad.wmnet with OS trixie
* 19:30 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 19:18 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1101.eqiad.wmnet with reason: host reimage
* 19:14 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp1100.eqiad.wmnet with reason: host reimage
* 19:13 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1101.eqiad.wmnet with reason: host reimage
* 19:13 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:13 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:10 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp1100.eqiad.wmnet with reason: host reimage
* 18:59 inflatador: bking@deploy2002 restarting opensearch-semantic-search eqiad to renew certs
* 18:57 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1101.eqiad.wmnet with OS trixie
* 18:55 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1101.eqiad.wmnet with OS trixie
* 18:54 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1100.eqiad.wmnet with OS trixie
* 18:53 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1100.eqiad.wmnet with OS trixie
* 18:50 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 18:49 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 18:36 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on hcaptcha-proxy4002.wikimedia.org with reason: depooled host (soon to be decomed)
* 18:35 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on hcaptcha-proxy4001.wikimedia.org with reason: depooled host (soon to be decomed)
* 18:10 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 18:10 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp1115.eqiad.wmnet with OS trixie
* 18:05 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on P<nowiki>{</nowiki>aqs[1011,1014,1016-1022]*<nowiki>}</nowiki> and P<nowiki>{</nowiki>P:Cassandra<nowiki>}</nowiki>
* 17:54 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 17:54 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp1115.eqiad.wmnet with OS trixie
* 17:53 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase-eqiad
* 17:49 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259136{{!}}EventStreamConfig: Document not adding performer attributes to SI interaction v2 stream (T418740)]] (duration: 06m 28s)
* 17:45 dreamyjazz@deploy2002: kharlan, dreamyjazz: Continuing with sync
* 17:45 dreamyjazz@deploy2002: kharlan, dreamyjazz: Backport for [[gerrit:1259136{{!}}EventStreamConfig: Document not adding performer attributes to SI interaction v2 stream (T418740)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:43 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1259136{{!}}EventStreamConfig: Document not adding performer attributes to SI interaction v2 stream (T418740)]]
* 17:35 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:34 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1101.eqiad.wmnet with OS trixie
* 17:34 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1101.eqiad.wmnet [reason: trixie reimaging]
* 17:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:31 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp1100.eqiad.wmnet with OS trixie
* 17:30 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1100.eqiad.wmnet [reason: trixie reimaging]
* 17:27 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:26 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:24 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:23 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:22 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:21 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 17:21 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:20 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp1115.eqiad.wmnet with OS trixie
* 17:20 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:18 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:17 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:17 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:16 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1009.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 17:13 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:13 bd808@deploy2002: Finished deploy [releng/jenkins-deploy@f47af21] (releasing): jobs: Use TZ=UTC in branchMWSingleVersion.groovy trigger ([[phab:T404399|T404399]]) (duration: 01m 36s)
* 17:12 bd808@deploy2002: Started deploy [releng/jenkins-deploy@f47af21] (releasing): jobs: Use TZ=UTC in branchMWSingleVersion.groovy trigger ([[phab:T404399|T404399]])
* 17:12 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:09 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:08 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:08 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:08 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:07 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:06 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:04 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:04 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:03 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:02 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 17:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1001.eqiad.wmnet
* 16:56 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1008.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:56 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 14 hosts
* 16:55 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:55 sukhe@cumin1003: START - Cookbook sre.hosts.remove-downtime for 14 hosts
* 16:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1001.eqiad.wmnet
* 16:53 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:52 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:52 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1007.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:46 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:41 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:38 volans@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcumin2001.codfw.wmnet
* 16:35 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 16:34 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:34 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 16:34 volans@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcumin2001.codfw.wmnet
* 16:32 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 16:31 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 16:30 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 16:29 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1023.eqiad.wmnet
* 16:29 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1023.eqiad.wmnet
* 16:28 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:27 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:24 eevans@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for aqs1010.eqiad.wmnet
* 16:24 eevans@cumin1003: START - Cookbook sre.hosts.remove-downtime for aqs1010.eqiad.wmnet
* 16:21 jgreen@dns1004: END - running authdns-update
* 16:19 jgreen@dns1004: START - running authdns-update
* 16:18 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:17 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:11 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Rebooting clouddb1023 [[phab:T419960|T419960]]
* 16:09 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1025.eqiad.wmnet
* 16:09 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1025.eqiad.wmnet
* 16:09 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1025.eqiad.wmnet
* 16:04 urandom: stopping aqs1010 for SSD replacement — [[phab:T420867|T420867]]
* 16:03 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 16:03 eevans@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on aqs1010.eqiad.wmnet with reason: Shutting down for SSD replacement — [[phab:T420867|T420867]]
* 15:58 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1025.eqiad.wmnet
* 15:57 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1025.eqiad.wmnet with reason: Rebooting clouddb1025 [[phab:T419960|T419960]]
* 15:57 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1023.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:56 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:53 topranks: disabling puppet for nftables-enabled machines to validate new ruleset on selected hosts before wider rollout [[phab:T420715|T420715]]
* 15:50 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1022.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:50 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:49 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 15:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1021.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:31 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1172.eqiad.wmnet
* 15:21 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 15:20 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 15:15 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1020.eqiad.wmnet
* 15:14 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1020.eqiad.wmnet
* 15:14 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1020.eqiad.wmnet
* 15:05 btullis@cumin1003: END (PASS) - Cookbook sre.hadoop.init-hadoop-workers (exit_code=0) for hosts an-worker1172.eqiad.wmnet
* 15:03 btullis@cumin1003: START - Cookbook sre.hadoop.init-hadoop-workers for hosts an-worker1172.eqiad.wmnet
* 15:03 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) opensearch-ipoid.discovery.wmnet on all recursors
* 15:03 bking@cumin2002: START - Cookbook sre.dns.wipe-cache opensearch-ipoid.discovery.wmnet on all recursors
* 15:03 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 15:01 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 15:01 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 14:59 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) opensearch-ipoid.discovery.wmnet on all recursors
* 14:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache opensearch-ipoid.discovery.wmnet on all recursors
* 14:58 sukhe@dns1004: END - running authdns-update
* 14:58 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) opensearch-test.discovery.wmnet on all recursors
* 14:58 bking@cumin2002: START - Cookbook sre.dns.wipe-cache opensearch-test.discovery.wmnet on all recursors
* 14:57 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet
* 14:57 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:56 sukhe@dns1004: START - running authdns-update
* 14:56 sukhe@dns1004: END - running authdns-update
* 14:56 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1020.eqiad.wmnet with reason: Rebooting clouddb1020 [[phab:T419960|T419960]]
* 14:56 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1019.eqiad.wmnet
* 14:56 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1019.eqiad.wmnet
* 14:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:55 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1019.eqiad.wmnet
* 14:55 sukhe@dns1004: START - running authdns-update
* 14:55 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase-eqiad
* 14:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:49 sukhe@dns1004: END - running authdns-update
* 14:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:48 sukhe@dns1004: START - running authdns-update
* 14:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:46 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1003.eqiad.wmnet
* 14:45 bking@cumin2002: conftool action : set/pooled=false; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 14:44 sukhe@dns1004: END - running authdns-update
* 14:43 sukhe@dns1004: START - running authdns-update
* 14:40 sukhe@dns1004: FAIL - running authdns-update
* 14:39 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1003.eqiad.wmnet
* 14:38 sukhe@dns1004: START - running authdns-update
* 14:37 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=eqiad
* 14:36 bking@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) k8s-ingress-dse-aa.discovery.wmnet on all recursors
* 14:36 bking@cumin2002: START - Cookbook sre.dns.wipe-cache k8s-ingress-dse-aa.discovery.wmnet on all recursors
* 14:34 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1019.eqiad.wmnet
* 14:34 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1019.eqiad.wmnet with reason: Rebooting clouddb1019 [[phab:T419960|T419960]]
* 14:33 sukhe@dns1004: FAIL - running authdns-update
* 14:33 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1002.eqiad.wmnet
* 14:33 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1018.eqiad.wmnet
* 14:33 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1018.eqiad.wmnet
* 14:32 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1018.eqiad.wmnet
* 14:32 sukhe@dns1004: START - running authdns-update
* 14:31 bking@cumin2002: conftool action : set/pooled=true; selector: dnsdisc=k8s-ingress-dse-aa,name=codfw
* 14:30 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host an-worker1172.eqiad.wmnet with OS bullseye
* 14:30 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:27 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1002.eqiad.wmnet
* 14:22 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1018.eqiad.wmnet with reason: Rebooting clouddb1018 [[phab:T419960|T419960]]
* 14:22 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1018.eqiad.wmnet
* 14:22 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1018.eqiad.wmnet
* 14:21 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1018.eqiad.wmnet
* 14:20 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:17 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit1001.eqiad.wmnet
* 14:14 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 14:14 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2352-2356].codfw.wmnet
* 14:14 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2352-2356].codfw.wmnet
* 14:13 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on db1253.eqiad.wmnet with reason: Under repair
* 14:11 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudrabbit1001.eqiad.wmnet
* 14:07 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2352-2356].codfw.wmnet
* 14:04 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha2002.wikimedia.org
* 14:04 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2352-2356].codfw.wmnet
* 14:03 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2347-2351].codfw.wmnet
* 14:03 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2347-2351].codfw.wmnet
* 14:00 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha2002.wikimedia.org
* 14:00 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha2001.wikimedia.org
* 13:59 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on an-worker1172.eqiad.wmnet with reason: host reimage
* 13:57 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2347-2351].codfw.wmnet
* 13:56 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha2001.wikimedia.org
* 13:56 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:55 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha1002.wikimedia.org
* 13:55 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on an-worker1172.eqiad.wmnet with reason: host reimage
* 13:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2347-2351].codfw.wmnet
* 13:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2342-2346].codfw.wmnet
* 13:52 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2342-2346].codfw.wmnet
* 13:51 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha1002.wikimedia.org
* 13:51 kamila@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host hcaptcha1001.wikimedia.org
* 13:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:47 kamila@cumin1003: START - Cookbook sre.hosts.reboot-single for host hcaptcha1001.wikimedia.org
* 13:47 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2342-2346].codfw.wmnet
* 13:43 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2342-2346].codfw.wmnet
* 13:43 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2337-2341].codfw.wmnet
* 13:43 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2337-2341].codfw.wmnet
* 13:42 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host an-worker1172.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:42 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 13:41 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2012.codfw.wmnet
* 13:41 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rdb2011.codfw.wmnet
* 13:38 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudlb1002.eqiad.wmnet
* 13:36 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2012.codfw.wmnet
* 13:36 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host rdb2011.codfw.wmnet
* 13:36 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2337-2341].codfw.wmnet
* 13:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2337-2341].codfw.wmnet
* 13:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2336].codfw.wmnet
* 13:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2336].codfw.wmnet
* 13:30 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy2003.codfw.wmnet
* 13:30 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1002.eqiad.wmnet
* 13:29 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudlb1002.eqiad.wmnet
* 13:29 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1002.eqiad.wmnet
* 13:28 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudlb1001.eqiad.wmnet
* 13:25 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2332-2336].codfw.wmnet
* 13:24 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy2003.codfw.wmnet
* 13:21 jforrester@deploy2002: mwscript-k8s job started: extensions/WikimediaMaintenance/maintenance/createExtensionTables.php --wiki=abstractwiki translate # [[phab:T420656|T420656]]
* 13:21 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2332-2336].codfw.wmnet
* 13:21 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 13:20 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1001.eqiad.wmnet
* 13:20 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudlb1001.eqiad.wmnet
* 13:20 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1001.eqiad.wmnet
* 13:19 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:19 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host cloudlb1001.eqiad.wmnet
* 13:18 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudlb1001.eqiad.wmnet
* 13:17 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1259035{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added (T420722)]], [[gerrit:1259036{{!}}tests: add coverage for WelcomeSurveyHooks::onCentralAuthPostLoginRedirect (T420722)]], [[gerrit:1259046{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added 2 (T420722)]] (duration: 11m 43s)
* 13:16 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:14 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:13 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:11 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:11 sgimeno@deploy2002: sgimeno: Continuing with sync
* 13:08 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2005-2006,2011-2018,2033-2039,2041-2042,2044,2046,2049-2051,2055-2062,2064-2065,2067-2078,2087-2095,2102-2115,2124-2179,2184-2199].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 13:08 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2186-2199].codfw.wmnet
* 13:08 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2186-2199].codfw.wmnet
* 13:07 sgimeno@deploy2002: sgimeno: Backport for [[gerrit:1259035{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added (T420722)]], [[gerrit:1259036{{!}}tests: add coverage for WelcomeSurveyHooks::onCentralAuthPostLoginRedirect (T420722)]], [[gerrit:1259046{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added 2 (T420722)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Ch
* 13:05 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1259035{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added (T420722)]], [[gerrit:1259036{{!}}tests: add coverage for WelcomeSurveyHooks::onCentralAuthPostLoginRedirect (T420722)]], [[gerrit:1259046{{!}}fix(WelcomeSurveyHooks): ensure accountJustCreated is always added 2 (T420722)]]
* 12:43 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "bast4006 - ayounsi@cumin1003"
* 12:42 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "bast4006 - ayounsi@cumin1003"
* 12:42 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host testvm2006.codfw.wmnet with OS bookworm
* 12:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host bast4006.wikimedia.org
* 12:38 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast4006.wikimedia.org with OS bookworm
* 12:34 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:28 cmooney@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
* 12:22 cmooney@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on testvm2006.codfw.wmnet with reason: host reimage
* 12:18 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast4006.wikimedia.org with reason: host reimage
* 12:16 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2186-2199].codfw.wmnet
* 12:14 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on bast4006.wikimedia.org with reason: host reimage
* 12:08 cmooney@cumin1003: START - Cookbook sre.hosts.reimage for host testvm2006.codfw.wmnet with OS bookworm
* 12:07 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2186-2199].codfw.wmnet
* 12:04 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2168-2179,2184-2185].codfw.wmnet
* 12:04 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2168-2179,2184-2185].codfw.wmnet
* 11:52 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2168-2179,2184-2185].codfw.wmnet
* 11:44 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2168-2179,2184-2185].codfw.wmnet
* 11:40 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2154-2167].codfw.wmnet
* 11:40 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2154-2167].codfw.wmnet
* 11:31 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2154-2167].codfw.wmnet
* 11:23 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2154-2167].codfw.wmnet
* 11:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2140-2153].codfw.wmnet
* 11:20 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host bast4006.wikimedia.org with OS bookworm
* 11:20 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2140-2153].codfw.wmnet
* 11:19 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM bast4006.wikimedia.org - ayounsi@cumin1003"
* 11:19 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM bast4006.wikimedia.org - ayounsi@cumin1003"
* 11:19 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) bast4006.wikimedia.org on all recursors
* 11:19 ayounsi@cumin1003: START - Cookbook sre.dns.wipe-cache bast4006.wikimedia.org on all recursors
* 11:19 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:19 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM bast4006.wikimedia.org - ayounsi@cumin1003"
* 11:19 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM bast4006.wikimedia.org - ayounsi@cumin1003"
* 11:15 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 11:15 ayounsi@cumin1003: START - Cookbook sre.ganeti.makevm for new host bast4006.wikimedia.org
* 11:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts install4003.wikimedia.org
* 11:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:09 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install4003.wikimedia.org decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003"
* 11:08 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: install4003.wikimedia.org decommissioned, removing all IPs except the asset tag one - ayounsi@cumin1003"
* 11:08 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2140-2153].codfw.wmnet
* 11:05 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 11:00 ayounsi@cumin1003: START - Cookbook sre.hosts.decommission for hosts install4003.wikimedia.org
* 10:57 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 10:55 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2140-2153].codfw.wmnet
* 10:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2126-2139].codfw.wmnet
* 10:55 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2126-2139].codfw.wmnet
* 10:44 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2126-2139].codfw.wmnet
* 10:43 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:43 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:38 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 10:38 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:38 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 10:38 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 10:31 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2126-2139].codfw.wmnet
* 10:30 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2104-2115,2124-2125].codfw.wmnet
* 10:30 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2104-2115,2124-2125].codfw.wmnet
* 10:28 topranks: disable puppet on routed-ganeti hosts to test nftables update on specific nodes [[phab:T420715|T420715]]
* 10:27 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s1
* 10:25 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s1
* 10:25 ayounsi@dns1004: END - running authdns-update
* 10:24 ayounsi@dns1004: START - running authdns-update
* 10:23 btullis@cumin1003: START - Cookbook sre.hosts.provision for host an-worker1172.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 10:22 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2104-2115,2124-2125].codfw.wmnet
* 10:20 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s4
* 10:18 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s4
* 10:13 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s8
* 10:11 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s8
* 10:09 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1172.eqiad.wmnet with OS bullseye
* 10:08 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2104-2115,2124-2125].codfw.wmnet
* 10:05 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s7
* 10:05 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2076-2078,2087-2095,2102-2103].codfw.wmnet
* 10:04 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2076-2078,2087-2095,2102-2103].codfw.wmnet
* 10:04 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s7
* 09:58 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s3
* 09:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:57 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s3
* 09:53 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:53 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:53 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2076-2078,2087-2095,2102-2103].codfw.wmnet
* 09:52 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s2
* 09:49 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s2
* 09:49 blake@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 09:49 blake@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 09:49 blake@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 09:49 blake@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 09:44 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2076-2078,2087-2095,2102-2103].codfw.wmnet
* 09:44 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s5
* 09:44 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2060-2062,2064-2065,2067-2075].codfw.wmnet
* 09:44 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2060-2062,2064-2065,2067-2075].codfw.wmnet
* 09:42 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s5
* 09:40 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:39 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:33 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section s6
* 09:32 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section s6
* 09:29 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:29 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:25 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 09:25 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 09:24 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section es7
* 09:23 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section es7
* 09:22 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 09:22 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 09:17 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section es6
* 09:16 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section es6
* 09:11 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:11 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:10 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section x3
* 09:09 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section x3
* 09:08 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2060-2062,2064-2065,2067-2075].codfw.wmnet
* 09:05 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:05 elukey@cumin1003: START - Cookbook sre.hosts.provision for host aux-k8s-worker1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 09:02 fceratto@cumin1003: END (PASS) - Cookbook sre.switchdc.databases.prepare (exit_code=0) for the switch from codfw to eqiad for section x1
* 09:01 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from codfw to eqiad for section x1
* 09:00 federico3: starting [[phab:T416706|T416706]]
* 09:00 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2060-2062,2064-2065,2067-2075].codfw.wmnet
* 08:59 fceratto@cumin1003: END (FAIL) - Cookbook sre.switchdc.databases.prepare (exit_code=99) for the switch from eqiad to codfw for section test-s4
* 08:59 fceratto@cumin1003: START - Cookbook sre.switchdc.databases.prepare for the switch from eqiad to codfw for section test-s4
* 08:59 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2038-2039,2041-2042,2044,2046,2049-2051,2055-2059].codfw.wmnet
* 08:59 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2038-2039,2041-2042,2044,2046,2049-2051,2055-2059].codfw.wmnet
* 08:50 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2038-2039,2041-2042,2044,2046,2049-2051,2055-2059].codfw.wmnet
* 08:46 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255736{{!}}hcaptcha: Use the global edit key for MobileFrontend edits if present (T420574)]] (duration: 14m 42s)
* 08:44 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 08:43 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 08:41 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2038-2039,2041-2042,2044,2046,2049-2051,2055-2059].codfw.wmnet
* 08:40 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2005-2006,2011-2018,2033-2037].codfw.wmnet
* 08:40 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2005-2006,2011-2018,2033-2037].codfw.wmnet
* 08:40 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 08:39 kharlan@deploy2002: kharlan: Continuing with sync
* 08:38 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 08:37 kharlan@deploy2002: kharlan: Backport for [[gerrit:1255736{{!}}hcaptcha: Use the global edit key for MobileFrontend edits if present (T420574)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 08:31 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1255736{{!}}hcaptcha: Use the global edit key for MobileFrontend edits if present (T420574)]]
* 08:29 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2005-2006,2011-2018,2033-2037].codfw.wmnet
* 08:19 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2005-2006,2011-2018,2033-2037].codfw.wmnet
* 08:18 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2005-2006,2011-2018,2033-2039,2041-2042,2044,2046,2049-2051,2055-2062,2064-2065,2067-2078,2087-2095,2102-2115,2124-2179,2184-2199].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 07:45 kartik@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254149{{!}}Enable ULS rewrite beta feature (T418187 T253303)]] (duration: 41m 30s)
* 07:42 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 07:33 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 07:30 kartik@deploy2002: kartik, abi: Continuing with sync
* 07:30 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 07:29 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 07:22 kartik@deploy2002: kartik, abi: Backport for [[gerrit:1254149{{!}}Enable ULS rewrite beta feature (T418187 T253303)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 07:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 07:17 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 07:16 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 07:03 kartik@deploy2002: Started scap sync-world: Backport for [[gerrit:1254149{{!}}Enable ULS rewrite beta feature (T418187 T253303)]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 55s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-22 ==
* 02:50 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doh7004.wikimedia.org with reason: depooled host
* 02:50 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doh7003.wikimedia.org with reason: depooled host
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 21s)
* 02:01 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-20 ==
* 23:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for lvs2013.codfw.wmnet
* 23:30 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for lvs2013.codfw.wmnet
* 22:34 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host lvs2013.codfw.wmnet
* 22:34 brett: Started pybal on lvs2013
* 22:27 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 21:57 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5023.eqsin.wmnet [reason: trixie reimaging]
* 21:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5023.eqsin.wmnet with OS trixie
* 21:55 hashar: Upgrading CI Jenkins [[phab:T420477|T420477]]
* 21:25 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5023.eqsin.wmnet with reason: host reimage
* 21:21 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5023.eqsin.wmnet with reason: host reimage
* 21:04 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2013.codfw.wmnet with reason: debugging ipip
* 20:46 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5023.eqsin.wmnet with OS trixie
* 20:45 mutante: contint1003/2003 apt remove --purge apache2* ; apt remove --purge php* {{!}} [[phab:T418521|T418521]]
* 20:43 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
* 20:40 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 20:38 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5023.eqsin.wmnet with OS trixie
* 20:24 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doh3006.wikimedia.org with reason: depooled host
* 20:24 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3 days, 0:00:00 on doh3005.wikimedia.org with reason: depooled host
* 20:23 sukhe@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 1 day, 0:00:00 on doh3005.wikimedia.org with reason: depooled host
* 19:50 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2013.codfw.wmnet with reason: debugging ipip
* 19:33 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs2013.codfw.wmnet
* 19:30 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2013.codfw.wmnet
* 19:21 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy (exit_code=0) rolling reboot on A:tcpproxy and A:tcpproxy
* 19:16 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5023.eqsin.wmnet with OS trixie
* 19:16 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5023.eqsin.wmnet [reason: trixie reimaging]
* 19:16 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5021.eqsin.wmnet [reason: trixie reimaging]
* 19:14 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5021.eqsin.wmnet with OS trixie
* 18:52 cdobbins@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2013.codfw.wmnet with reason: reboot
* 18:43 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5021.eqsin.wmnet with reason: host reimage
* 18:39 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5021.eqsin.wmnet with reason: host reimage
* 18:28 cdobbins@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs2013.codfw.wmnet with reason: reboot
* 18:16 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-tcp-proxy rolling reboot on A:tcpproxy and A:tcpproxy
* 18:14 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on db1253.eqiad.wmnet with reason: [[phab:T420041|T420041]]
* 17:59 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5021.eqsin.wmnet with OS trixie
* 17:54 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5021.eqsin.wmnet with OS trixie
* 17:51 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host lvs2014.codfw.wmnet
* 17:40 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4 days, 0:00:00 on contint1003.wikimedia.org with reason: jenkins on java21
* 17:39 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs2014.codfw.wmnet
* 16:54 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 16:54 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1172.eqiad.wmnet with OS bullseye
* 16:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 16:33 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5021.eqsin.wmnet with OS trixie
* 16:32 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5021.eqsin.wmnet [reason: trixie reimaging]
* 16:09 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host an-worker1172.eqiad.wmnet with OS bullseye
* 16:08 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2002.codfw.wmnet
* 16:02 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf2002.codfw.wmnet
* 15:51 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2001.codfw.wmnet
* 15:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 15:45 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf2001.codfw.wmnet
* 15:43 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2041.codfw.wmnet
* 15:37 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc2041.codfw.wmnet
* 15:32 cparle@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 15:32 cparle@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 15:29 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-eqiad
* 15:16 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2040.codfw.wmnet
* 15:10 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc2040.codfw.wmnet
* 15:02 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 15:01 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 15:00 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:59 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:58 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:58 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:57 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:56 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:55 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:50 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:45 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2002].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 14:45 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2002.codfw.wmnet
* 14:45 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2002.codfw.wmnet
* 14:44 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:44 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:37 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2002.codfw.wmnet
* 14:37 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2002.codfw.wmnet
* 14:37 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2001.codfw.wmnet
* 14:37 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2001.codfw.wmnet
* 14:36 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:35 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:35 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:34 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:30 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2001.codfw.wmnet
* 14:29 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2001.codfw.wmnet
* 14:29 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2001-2002].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 14:27 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1335-1349].eqiad.wmnet
* 14:27 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1335-1349].eqiad.wmnet
* 14:26 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2039.codfw.wmnet
* 14:21 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc2039.codfw.wmnet
* 14:16 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/wdqs-queryhammer: apply
* 14:16 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/wdqs-queryhammer: apply
* 14:14 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc2038.codfw.wmnet
* 14:08 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc2038.codfw.wmnet
* 13:54 jgreen@dns1004: END - running authdns-update
* 13:52 jgreen@dns1004: START - running authdns-update
* 13:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:39 inflatador: bking@deploy2002 restarting opensearch-ipoid cluster to apply new certificates
* 13:33 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-eqiad
* 13:20 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:14 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-canary
* 13:14 sukhe@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for doh[3005-3006].wikimedia.org
* 13:14 sukhe@cumin1003: START - Cookbook sre.hosts.remove-downtime for doh[3005-3006].wikimedia.org
* 13:08 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-canary
* 13:05 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:03 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:58 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2006.codfw.wmnet
* 12:56 cparle@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 12:55 cparle@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 12:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2006.codfw.wmnet
* 12:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2001.codfw.wmnet
* 12:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2001.codfw.wmnet
* 12:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1005.eqiad.wmnet
* 12:35 jiji@cumin1003: END (ERROR) - Cookbook sre.memcached.roll-reboot-restart (exit_code=97) rolling reboot on A:memcached-codfw
* 12:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1005.eqiad.wmnet
* 12:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw2001.wikimedia.org
* 12:28 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw2001.wikimedia.org
* 12:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-rw1001.wikimedia.org
* 12:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-rw1001.wikimedia.org
* 11:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 11:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1172.eqiad.wmnet with OS bullseye
* 11:27 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 11:24 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 10:26 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 10:13 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:12 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:10 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:05 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:05 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:04 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:02 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:58 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:57 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:57 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:56 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:56 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:55 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:53 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:50 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:46 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:46 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:45 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:45 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:45 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:37 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:37 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:37 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:36 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:36 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:35 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:35 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:34 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:34 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:34 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:33 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:26 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:25 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:24 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:23 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:20 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:19 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:19 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:18 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.print-network-topology (exit_code=0)
* 09:18 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:18 jayme@cumin1003: START - Cookbook sre.k8s.print-network-topology
* 09:17 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host an-worker1172.eqiad.wmnet with OS bullseye
* 09:15 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 08:57 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 05:30 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5024.eqsin.wmnet [reason: trixie reimaging]
* 05:30 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5019.eqsin.wmnet [reason: trixie reimaging]
* 02:43 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on doh3005.wikimedia.org with reason: alerting is flapping
* 02:42 sukhe@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on doh3006.wikimedia.org with reason: alerting is flapping
* 01:21 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5019.eqsin.wmnet with OS trixie
* 01:17 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5024.eqsin.wmnet with OS trixie
* 00:48 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5019.eqsin.wmnet with reason: host reimage
* 00:44 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
* 00:38 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5019.eqsin.wmnet with reason: host reimage
* 00:37 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5024.eqsin.wmnet with reason: host reimage
* 00:05 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5024.eqsin.wmnet with OS trixie
* 00:01 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5024.eqsin.wmnet with OS trixie
* 00:01 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5024.eqsin.wmnet with OS trixie
== 2026-03-19 ==
* 23:59 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5019.eqsin.wmnet with OS trixie
* 23:40 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255801{{!}}Make the handler follow the thumb steps (T414805)]] (duration: 06m 14s)
* 23:36 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 23:35 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1255801{{!}}Make the handler follow the thumb steps (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:33 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1255801{{!}}Make the handler follow the thumb steps (T414805)]]
* 22:48 zabe@deploy2002: mwscript-k8s job started: foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https # [[phab:T420643|T420643]]
* 22:19 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 22:18 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 22:17 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 22:16 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 22:08 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255886{{!}}Set WikiLambdaAbstractNamespaces's merge_strategy to provide_default (T420649)]] (duration: 06m 46s)
* 22:04 jforrester@deploy2002: jforrester: Continuing with sync
* 22:03 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255886{{!}}Set WikiLambdaAbstractNamespaces's merge_strategy to provide_default (T420649)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:01 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255886{{!}}Set WikiLambdaAbstractNamespaces's merge_strategy to provide_default (T420649)]]
* 21:58 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5019.eqsin.wmnet with OS trixie
* 21:57 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:restbase-codfw
* 21:57 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5019.eqsin.wmnet [reason: trixie reimaging]
* 21:56 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5024.eqsin.wmnet with OS trixie
* 21:56 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5024.eqsin.wmnet [reason: trixie reimaging]
* 21:55 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255765{{!}}Implement addListener fallback for older browsers in matchMedia (T419717)]] (duration: 07m 17s)
* 21:51 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 21:49 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1255765{{!}}Implement addListener fallback for older browsers in matchMedia (T419717)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:48 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1255765{{!}}Implement addListener fallback for older browsers in matchMedia (T419717)]]
* 21:29 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255881{{!}}Skins: Address issue with blurry images for large thumbnails (T375981)]] (duration: 07m 03s)
* 21:25 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 21:24 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1255881{{!}}Skins: Address issue with blurry images for large thumbnails (T375981)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:22 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1255881{{!}}Skins: Address issue with blurry images for large thumbnails (T375981)]]
* 21:11 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backup2020.codfw.wmnet with reason: kernel module reload
* 21:10 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 11 hosts with reason: kernel module reload
* 20:36 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254865{{!}}Deploy Extension:PersonalDashboard to English Wikipedia (T418367)]], [[gerrit:1253654{{!}}Deploy PRV to 13 wikis (T420273)]] (duration: 11m 00s)
* 20:32 kgraessle@deploy2002: kgraessle, arlolra: Continuing with sync
* 20:27 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1016.eqiad.wmnet
* 20:27 kgraessle@deploy2002: kgraessle, arlolra: Backport for [[gerrit:1254865{{!}}Deploy Extension:PersonalDashboard to English Wikipedia (T418367)]], [[gerrit:1253654{{!}}Deploy PRV to 13 wikis (T420273)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1254865{{!}}Deploy Extension:PersonalDashboard to English Wikipedia (T418367)]], [[gerrit:1253654{{!}}Deploy PRV to 13 wikis (T420273)]]
* 20:25 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1016.eqiad.wmnet
* 20:11 cdobbins@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs1016.eqiad.wmnet with reason: reboot
* 20:01 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:01 cmooney@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add analytic vlan hostnames - cmooney@cumin1003"
* 20:01 cmooney@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add analytic vlan hostnames - cmooney@cumin1003"
* 19:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1018.eqiad.wmnet
* 19:56 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 19:56 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1018.eqiad.wmnet
* 19:55 cmooney@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 19:53 cmooney@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) 4.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.8.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa. on all recursors
* 19:53 cmooney@cumin1003: START - Cookbook sre.dns.wipe-cache 4.0.0.0.0.0.0.0.0.0.0.0.0.0.0.0.8.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa. on all recursors
* 19:52 cmooney@cumin1003: START - Cookbook sre.dns.netbox
* 19:51 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 7 hosts with reason: kernel module reload
* 19:44 topranks: disable IPv6 VRRP for et-1/0/5.1023 sub-interfaces on eqiad core routers [[phab:T405562|T405562]]
* 19:36 brett: stopping pybal/puppet on lvs1018 for reboots
* 19:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on lvs1018.eqiad.wmnet with reason: reboots
* 19:00 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 6 hosts with reason: kernel module reload
* 19:00 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1019.eqiad.wmnet
* 19:00 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:restbase-codfw
* 19:00 topranks: add vlan sub-interface for analytics1-d-eqiad vlan to leaf switches in eqiad row d [[phab:T405562|T405562]]
* 18:44 cdobbins@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on lvs1019.eqiad.wmnet with reason: planned reboot
* 18:42 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:aqs-codfw
* 18:31 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255824{{!}}RepoBooks::onMediaWikiServices: Skip all low NSes, not just NS0 (T420617)]], [[gerrit:1255820{{!}}SpecialAbstractContent: Fix hard-coded policy list page namespace]], [[gerrit:1255794{{!}}[abstractwiki] Allow "Abstract:" as well as "Abstract Wikipedia:" as NS_PROJECT]] (duration: 06m 20s)
* 18:27 jforrester@deploy2002: jforrester: Continuing with sync
* 18:26 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255824{{!}}RepoBooks::onMediaWikiServices: Skip all low NSes, not just NS0 (T420617)]], [[gerrit:1255820{{!}}SpecialAbstractContent: Fix hard-coded policy list page namespace]], [[gerrit:1255794{{!}}[abstractwiki] Allow "Abstract:" as well as "Abstract Wikipedia:" as NS_PROJECT]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:24 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255824{{!}}RepoBooks::onMediaWikiServices: Skip all low NSes, not just NS0 (T420617)]], [[gerrit:1255820{{!}}SpecialAbstractContent: Fix hard-coded policy list page namespace]], [[gerrit:1255794{{!}}[abstractwiki] Allow "Abstract:" as well as "Abstract Wikipedia:" as NS_PROJECT]]
* 18:02 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lvs1020.eqiad.wmnet
* 17:55 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
* 17:46 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 17:46 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 17:45 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host lvs1020.eqiad.wmnet
* 17:44 cdobbins@cumin2002: START - Cookbook sre.hosts.reboot-single for host lvs1020.eqiad.wmnet
* 17:30 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5026.*
* 17:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host hcaptcha-proxy4004.wikimedia.org
* 17:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy4004.wikimedia.org with OS bookworm
* 17:26 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on contint1003.wikimedia.org with reason: jenkins on java21
* 17:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp5026.eqsin.wmnet
* 17:22 bd808@deploy2002: helmfile [codfw] DONE helmfile.d/services/developer-portal: apply
* 17:22 bd808@deploy2002: helmfile [codfw] START helmfile.d/services/developer-portal: apply
* 17:22 bd808@deploy2002: helmfile [eqiad] DONE helmfile.d/services/developer-portal: apply
* 17:22 bd808@deploy2002: helmfile [eqiad] START helmfile.d/services/developer-portal: apply
* 17:21 bd808@deploy2002: helmfile [staging] DONE helmfile.d/services/developer-portal: apply
* 17:18 bd808@deploy2002: helmfile [staging] START helmfile.d/services/developer-portal: apply
* 17:15 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp5026.eqsin.wmnet
* 17:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy4004.wikimedia.org with reason: host reimage
* 17:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts doh4002.wikimedia.org
* 17:08 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:08 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: doh4002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 17:08 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy4004.wikimedia.org with reason: host reimage
* 17:08 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: doh4002.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 17:07 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 17:07 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 17:05 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp5026.eqsin.wmnet with reason: firmware updates
* 17:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 17:03 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5025.*
* 17:01 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp5025.eqsin.wmnet
* 16:59 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts doh4002.wikimedia.org
* 16:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts doh4001.wikimedia.org
* 16:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: doh4001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 16:59 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: doh4001.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 16:58 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 16:57 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 16:57 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 16:56 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 16:54 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:52 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp5025.eqsin.wmnet
* 16:50 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts doh4001.wikimedia.org
* 16:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-master1004.eqiad.wmnet
* 16:46 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1151.eqiad.wmnet
* 16:45 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5026.eqsin.wmnet with OS trixie
* 16:44 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy4004.wikimedia.org with OS bookworm
* 16:44 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet,service=s3
* 16:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4004.wikimedia.org - jmm@cumin2002"
* 16:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4004.wikimedia.org - jmm@cumin2002"
* 16:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy4004.wikimedia.org on all recursors
* 16:43 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy4004.wikimedia.org on all recursors
* 16:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4004.wikimedia.org - jmm@cumin2002"
* 16:42 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4004.wikimedia.org - jmm@cumin2002"
* 16:42 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255039{{!}}Activate Abstract Wikipedia (T411723)]] (duration: 06m 09s)
* 16:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp5025.eqsin.wmnet with reason: firmware updates
* 16:41 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-master1004.eqiad.wmnet
* 16:40 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5025.eqsin.wmnet with OS trixie
* 16:39 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1151.eqiad.wmnet
* 16:39 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:39 jmm@cumin2002: END (ERROR) - Cookbook sre.dns.netbox (exit_code=97)
* 16:38 jforrester@deploy2002: jforrester: Continuing with sync
* 16:38 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255039{{!}}Activate Abstract Wikipedia (T411723)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:36 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255039{{!}}Activate Abstract Wikipedia (T411723)]]
* 16:35 jmm@cumin2002: END (PASS) - Cookbook sre.netbox.restart-reboot (exit_code=0) rolling reboot on A:netbox
* 16:33 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255779{{!}}Revert "[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki"]] (duration: 07m 19s)
* 16:29 jforrester@deploy2002: jforrester: Continuing with sync
* 16:28 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255779{{!}}Revert "[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:26 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255779{{!}}Revert "[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki"]]
* 16:25 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:24 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255773{{!}}[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki (T420531)]] (duration: 06m 06s)
* 16:23 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:aqs-codfw
* 16:21 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 16:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 16:20 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:20 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy4004.wikimedia.org
* 16:20 fabfur@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=1) rolling upgrade of HAProxy on P<nowiki>{</nowiki>cp2041*<nowiki>}</nowiki> and A:cp - 3.2 test upgrade ()
* 16:20 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P<nowiki>{</nowiki>cp2041*<nowiki>}</nowiki> and A:cp - 3.2 test upgrade ()
* 16:20 jforrester@deploy2002: jforrester: Continuing with sync
* 16:19 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255773{{!}}[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki (T420531)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2003.codfw.wmnet
* 16:17 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255773{{!}}[abstractwiki] Temporarily disable wgWikiLambdaEnableAbstractMode to see if this means we can create the wiki (T420531)]]
* 16:17 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netbox.discovery.wmnet. on all recursors
* 16:17 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netbox.discovery.wmnet. on all recursors
* 16:17 jmm@cumin2002: START - Cookbook sre.netbox.restart-reboot rolling reboot on A:netbox
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host hcaptcha-proxy4003.wikimedia.org
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host hcaptcha-proxy4003.wikimedia.org with OS bookworm
* 16:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1142.eqiad.wmnet
* 16:14 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk2003.codfw.wmnet
* 16:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2002.codfw.wmnet
* 16:13 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
* 16:10 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk2002.codfw.wmnet
* 16:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5025.eqsin.wmnet with reason: host reimage
* 16:08 brouberol@deploy2002: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 16:07 brouberol@deploy2002: helmfile [codfw] START helmfile.d/admin 'apply'.
* 16:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1142.eqiad.wmnet
* 16:06 brouberol@deploy2002: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 16:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk2001.codfw.wmnet
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5026.eqsin.wmnet with reason: host reimage
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5025.eqsin.wmnet with reason: host reimage
* 16:05 brouberol@deploy2002: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 16:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk2001.codfw.wmnet
* 15:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on hcaptcha-proxy4003.wikimedia.org with reason: host reimage
* 15:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on hcaptcha-proxy4003.wikimedia.org with reason: host reimage
* 15:36 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief2002.codfw.wmnet
* 15:35 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5025.eqsin.wmnet with OS trixie
* 15:34 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5026.eqsin.wmnet with OS trixie
* 15:34 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5025.eqsin.wmnet with OS trixie
* 15:34 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5026.eqsin.wmnet with OS trixie
* 15:32 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief2002.codfw.wmnet
* 15:32 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief1002.eqiad.wmnet
* 15:32 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host hcaptcha-proxy4003.wikimedia.org with OS bookworm
* 15:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4003.wikimedia.org - jmm@cumin2002"
* 15:31 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM hcaptcha-proxy4003.wikimedia.org - jmm@cumin2002"
* 15:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) hcaptcha-proxy4003.wikimedia.org on all recursors
* 15:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache hcaptcha-proxy4003.wikimedia.org on all recursors
* 15:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4003.wikimedia.org - jmm@cumin2002"
* 15:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test2001.codfw.wmnet
* 15:29 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host acmechief-test1001.eqiad.wmnet
* 15:28 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief1002.eqiad.wmnet
* 15:26 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test2001.codfw.wmnet
* 15:25 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host acmechief-test1001.eqiad.wmnet
* 15:22 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255747{{!}}Hooks: Re-apply I52fc151ab88d79754baeff35d2c0f200ebe9fc9a]] (duration: 09m 55s)
* 15:22 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM hcaptcha-proxy4003.wikimedia.org - jmm@cumin2002"
* 15:22 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:22 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:21 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:21 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:21 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:18 phuedx@deploy2002: phuedx: Continuing with sync
* 15:18 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:18 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:17 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:17 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 15:16 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 15:16 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 15:15 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 15:15 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 15:15 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 15:15 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 15:14 phuedx@deploy2002: phuedx: Backport for [[gerrit:1255747{{!}}Hooks: Re-apply I52fc151ab88d79754baeff35d2c0f200ebe9fc9a]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:14 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 15:14 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host hcaptcha-proxy4003.wikimedia.org
* 15:12 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1255747{{!}}Hooks: Re-apply I52fc151ab88d79754baeff35d2c0f200ebe9fc9a]]
* 15:11 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 15:10 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 15:10 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 15:09 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 15:09 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 15:09 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 15:07 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5026.eqsin.wmnet with OS trixie
* 15:06 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5025.eqsin.wmnet with OS trixie
* 15:06 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh4004.wikimedia.org
* 15:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host doh4004.wikimedia.org with OS bookworm
* 15:05 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk1003.eqiad.wmnet
* 15:01 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk1003.eqiad.wmnet
* 14:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk1002.eqiad.wmnet
* 14:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host matomo1003.eqiad.wmnet
* 14:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zookeeper-test1002.eqiad.wmnet
* 14:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk1002.eqiad.wmnet
* 14:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host flink-zk1001.eqiad.wmnet
* 14:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host matomo1003.eqiad.wmnet
* 14:52 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host zookeeper-test1002.eqiad.wmnet
* 14:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host flink-zk1001.eqiad.wmnet
* 14:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-conf1006.eqiad.wmnet
* 14:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh4004.wikimedia.org with reason: host reimage
* 14:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-conf1006.eqiad.wmnet
* 14:43 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh4004.wikimedia.org with reason: host reimage
* 14:40 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 14:38 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 14:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-conf1005.eqiad.wmnet
* 14:32 bking@cumin2002: conftool action : set/pooled=yes:weight=1; selector: name=dse-k8s-worker1010.eqiad.wmnet{{!}}dse-k8s-worker1011.eqiad.wmnet{{!}}dse-k8s-worker1012.eqiad.wmnet{{!}}dse-k8s-worker1013.eqiad.wmnet{{!}}dse-k8s-worker1015.eqiad.wmnet{{!}}dse-k8s-worker1016.eqiad.wmnet{{!}}dse-k8s-worker1017.eqiad.wmnet{{!}}dse-k8s-worker1018.eqiad.wmnet{{!}}dse-k8s-worker1019.eqiad.wmnet
* 14:29 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-conf1005.eqiad.wmnet
* 14:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-conf1004.eqiad.wmnet
* 14:25 bking@cumin2002: conftool action : set/pooled=yes:weight=10; selector: name=dse-k8s-worker1012.eqiad.wmnet{{!}}dse-k8s-worker1015.eqiad.wmnet{{!}}dse-k8s-worker1016.eqiad.wmnet{{!}}dse-k8s-worker1017.eqiad.wmnet
* 14:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-conf1004.eqiad.wmnet
* 14:21 jmm@deploy2002: helmfile [eqiad] DONE helmfile.d/services/proton: apply
* 14:20 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host doh4004.wikimedia.org with OS bookworm
* 14:20 jmm@deploy2002: helmfile [eqiad] START helmfile.d/services/proton: apply
* 14:19 jmm@deploy2002: helmfile [codfw] DONE helmfile.d/services/proton: apply
* 14:18 jmm@deploy2002: helmfile [codfw] START helmfile.d/services/proton: apply
* 14:18 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh4004.wikimedia.org - jmm@cumin2002"
* 14:17 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh4004.wikimedia.org - jmm@cumin2002"
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4004.wikimedia.org on all recursors
* 14:17 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4004.wikimedia.org on all recursors
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4004.wikimedia.org - jmm@cumin2002"
* 14:13 jmm@deploy2002: helmfile [staging] DONE helmfile.d/services/proton: apply
* 14:12 jmm@deploy2002: helmfile [staging] START helmfile.d/services/proton: apply
* 14:11 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4004.wikimedia.org - jmm@cumin2002"
* 14:04 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:04 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh4004.wikimedia.org
* 14:03 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host doh4003.wikimedia.org
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host doh4003.wikimedia.org with OS bookworm
* 13:53 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:49 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:48 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:46 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250107{{!}}Expose new wikifunctions.v0 REST API module on Wikifunctions.org only (T419053)]] (duration: 06m 03s)
* 13:42 jforrester@deploy2002: jforrester: Continuing with sync
* 13:42 jforrester@deploy2002: jforrester: Backport for [[gerrit:1250107{{!}}Expose new wikifunctions.v0 REST API module on Wikifunctions.org only (T419053)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:40 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1250107{{!}}Expose new wikifunctions.v0 REST API module on Wikifunctions.org only (T419053)]]
* 13:38 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on doh4003.wikimedia.org with reason: host reimage
* 13:33 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on doh4003.wikimedia.org with reason: host reimage
* 13:22 urbanecm@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255686{{!}}CreateAccount: Add class to aide in instrumentation]], [[gerrit:1255685{{!}}createAccount: Log exposure and CTRs for account creation experiment (T419916)]] (duration: 12m 58s)
* 13:22 moritzm: upgrade rpki1001 to Routinator 0.15.1 [[phab:T420572|T420572]]
* 13:15 urbanecm@deploy2002: migr, urbanecm: Continuing with sync
* 13:13 urbanecm@deploy2002: migr, urbanecm: Backport for [[gerrit:1255686{{!}}CreateAccount: Add class to aide in instrumentation]], [[gerrit:1255685{{!}}createAccount: Log exposure and CTRs for account creation experiment (T419916)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:12 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host doh4003.wikimedia.org with OS bookworm
* 13:09 urbanecm@deploy2002: Started scap sync-world: Backport for [[gerrit:1255686{{!}}CreateAccount: Add class to aide in instrumentation]], [[gerrit:1255685{{!}}createAccount: Log exposure and CTRs for account creation experiment (T419916)]]
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh4003.wikimedia.org - jmm@cumin2002"
* 13:09 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM doh4003.wikimedia.org - jmm@cumin2002"
* 13:07 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 17 hosts with reason: upgrade
* 13:01 moritzm: installing rsync security updates
* 12:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1017.eqiad.wmnet
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts testvm7001.magru.wmnet
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm7001.magru.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 12:59 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: testvm7001.magru.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 12:57 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 12:54 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 12:53 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1017.eqiad.wmnet
* 12:53 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1016.eqiad.wmnet
* 12:52 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 12:52 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 12:51 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 12:51 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 12:50 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 12:50 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 12:50 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 12:50 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 12:49 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 12:48 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 12:48 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 12:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1016.eqiad.wmnet
* 12:47 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:46 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:46 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 12:46 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:46 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:45 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:45 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:45 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 12:45 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 12:44 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 12:43 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 12:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 12:43 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host krb2002.codfw.wmnet
* 12:41 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 12:41 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts testvm7001.magru.wmnet
* 12:41 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 12:41 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 12:40 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 12:40 kamila@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 12:39 kamila@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 12:39 kamila@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 12:38 kamila@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 12:37 kamila@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 12:37 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:37 kamila@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
* 12:37 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh4003.wikimedia.org
* 12:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host krb2002.codfw.wmnet
* 12:29 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:27 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:25 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:25 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:24 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:23 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:22 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 12:22 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 12:10 urbanecm@deploy2002: mwscript-k8s job started: GrowthExperiments:reassignMentees --wiki=enwiki --mentor=Bilorv --performer=Bilorv --as-job # [[phab:T418194|T418194]]
* 11:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 11:59 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:58 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on A:cephosd-codfw
* 11:57 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 11:53 moritzm: upgrade rpki2003 to Routinator 0.15.1 [[phab:T420572|T420572]]
* 11:46 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:40 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:40 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 11:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1017.eqiad.wmnet with reason: host reimage
* 11:26 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1017.eqiad.wmnet with reason: host reimage
* 11:18 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-codfw
* 11:18 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-codfw
* 11:11 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 11:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki1001.eqiad.wmnet
* 11:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki1001.eqiad.wmnet
* 11:05 klausman@cumin2002: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:ml-serve-worker-codfw
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host rpki2003.codfw.wmnet
* 10:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host rpki2003.codfw.wmnet
* 10:55 btullis@cumin1003: END (PASS) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5]*<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 10:54 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host doh4003.wikimedia.org
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 10:54 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 10:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 10:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2004.codfw.wmnet
* 10:51 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-codfw
* 10:50 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:50 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 10:50 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 10:50 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:50 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 10:50 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 10:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2004.codfw.wmnet
* 10:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2005.codfw.wmnet
* 10:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema2004.codfw.wmnet
* 10:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2005.codfw.wmnet
* 10:43 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:43 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh4003.wikimedia.org
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4007.ulsfo.wmnet to cluster ulsfo02 and group 01
* 10:42 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4007.ulsfo.wmnet to cluster ulsfo02 and group 01
* 10:41 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host schema2004.codfw.wmnet
* 10:39 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema2003.codfw.wmnet
* 10:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host schema2003.codfw.wmnet
* 10:37 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1002.wikimedia.org
* 10:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema1004.eqiad.wmnet
* 10:36 Raine: created temporary categorylinks_icu72 tables -- [[phab:T419980|T419980]], [[phab:T419049|T419049]]
* 10:36 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 10:34 aokoth@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/services/miscweb: apply
* 10:33 aokoth@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/services/miscweb: apply
* 10:32 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host schema1004.eqiad.wmnet
* 10:32 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 10:31 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 10:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4007.ulsfo.wmnet
* 10:29 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on P<nowiki>{</nowiki>cephosd100[4-5]*<nowiki>}</nowiki> and (A:cephosd-codfw or A:cephosd-eqiad)
* 10:28 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:28 fnegri@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1002.wikimedia.org
* 10:26 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2006.codfw.wmnet
* 10:25 btullis@cumin1003: END (PASS) - Cookbook sre.opensearch.roll-restart-reboot (exit_code=0) rolling reboot on A:datahubsearch
* 10:24 btullis@cumin1003: END (FAIL) - Cookbook sre.ceph.roll-restart-reboot-server (exit_code=99) rolling reboot on A:cephosd-eqiad
* 10:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4007.ulsfo.wmnet
* 10:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2006.codfw.wmnet
* 10:21 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host clouddumps1001.wikimedia.org
* 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2008.wikimedia.org
* 10:19 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:18 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2008.wikimedia.org
* 10:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2007.codfw.wmnet
* 10:13 fnegri@cumin1003: START - Cookbook sre.hosts.reboot-single for host clouddumps1001.wikimedia.org
* 10:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2007.codfw.wmnet
* 10:09 btullis@cumin1003: START - Cookbook sre.opensearch.roll-restart-reboot rolling reboot on A:datahubsearch
* 10:04 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:03 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4007.ulsfo.wmnet with OS bookworm
* 09:58 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 17 hosts with reason: upgrade
* 09:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host schema1003.eqiad.wmnet
* 09:53 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host schema1003.eqiad.wmnet
* 09:46 jnuche@deploy2002: Finished deploy [releng/jenkins-deploy@863e5c2] (releasing): [[phab:T420477|T420477]] (duration: 01m 07s)
* 09:45 jnuche@deploy2002: Started deploy [releng/jenkins-deploy@863e5c2] (releasing): [[phab:T420477|T420477]]
* 09:43 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 09:43 jnuche@deploy2002: Finished deploy [releng/jenkins-deploy@863e5c2] (releasing): [[phab:T420477|T420477]] (duration: 00m 59s)
* 09:42 jnuche@deploy2002: Started deploy [releng/jenkins-deploy@863e5c2] (releasing): [[phab:T420477|T420477]]
* 09:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4007.ulsfo.wmnet with reason: host reimage
* 09:35 moritzm: installing libnginx-mod-http-lua security updates
* 09:35 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4007.ulsfo.wmnet with reason: host reimage
* 09:29 btullis@cumin1003: START - Cookbook sre.ceph.roll-restart-reboot-server rolling reboot on A:cephosd-eqiad
* 09:26 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
* 09:26 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
* 09:24 klausman@cumin2002: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:ml-serve-worker-codfw
* 09:21 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:21 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:19 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:19 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 09:13 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4007.ulsfo.wmnet with OS bookworm
* 09:11 klausman@cumin2002: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-codfw
* 09:01 moritzm: remove ganeti4007 from classic Ganeti cluster in ulsfo [[phab:T418993|T418993]]
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of doh4001.wikimedia.org to plain
* 08:54 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of doh4001.wikimedia.org to plain
* 08:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of doh4002.wikimedia.org to plain
* 08:46 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of doh4002.wikimedia.org to plain
* 08:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of hcaptcha-proxy4001.wikimedia.org to plain
* 08:45 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of hcaptcha-proxy4001.wikimedia.org to plain
* 08:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of hcaptcha-proxy4002.wikimedia.org to plain
* 08:44 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of hcaptcha-proxy4002.wikimedia.org to plain
* 08:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.changedisk (exit_code=0) for changing disk type of install4003.wikimedia.org to plain
* 08:42 jmm@cumin2002: START - Cookbook sre.ganeti.changedisk for changing disk type of install4003.wikimedia.org to plain
* 08:40 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:39 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4007.ulsfo.wmnet
* 08:38 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host doh4003.wikimedia.org
* 08:38 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 08:38 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 08:38 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:38 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 08:38 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Remove records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 08:35 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) doh4003.wikimedia.org on all recursors
* 08:34 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache doh4003.wikimedia.org on all recursors
* 08:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 08:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM doh4003.wikimedia.org - jmm@cumin2002"
* 08:31 moritzm: installing python-apt security updates
* 08:29 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:29 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host doh4003.wikimedia.org
* 08:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 08:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 08:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 08:14 moritzm: installing imagemagick security updates on Bullseye
* 08:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1048.eqiad.wmnet
* 08:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1048.eqiad.wmnet
* 08:12 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.20 refs [[phab:T413811|T413811]]
* 08:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1048.eqiad.wmnet
* 07:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1048.eqiad.wmnet
* 07:17 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 07:17 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 07:16 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 07:16 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 07:14 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/services/miscweb: apply
* 07:14 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/services/miscweb: apply
* 04:53 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 00:06 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon2003.codfw.wmnet
* 00:02 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon2003.codfw.wmnet
* 00:01 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host kafkamon1003.eqiad.wmnet
== 2026-03-18 ==
* 23:58 mutante: releases2003 - kill 782 (stunnel4) - systemctl start stunnel4 - fix [[phab:T420246|T420246]] [[phab:T420388|T420388]] [[phab:T420411|T420411]]
* 23:57 herron@cumin1003: START - Cookbook sre.hosts.reboot-single for host kafkamon1003.eqiad.wmnet
* 23:49 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:cassandra-dev
* 23:23 eevans@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:cassandra-dev
* 23:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5017.*
* 23:02 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5020.*
* 23:01 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5028.*
* 22:40 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5028.eqsin.wmnet with OS trixie
* 22:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5020.eqsin.wmnet with OS trixie
* 22:08 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 22:04 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5028.eqsin.wmnet with reason: host reimage
* 21:51 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-eqiad
* 21:49 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-reboot (exit_code=0) rolling reboot on A:dnsbox
* 21:49 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7002.wikimedia.org
* 21:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 21:41 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5027.*
* 21:40 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5020.eqsin.wmnet with reason: host reimage
* 21:31 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 21:30 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5028.eqsin.wmnet with OS trixie
* 21:30 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7002.wikimedia.org
* 21:29 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5027.eqsin.wmnet with OS trixie
* 21:27 jforrester@deploy2002: mwscript-k8s job started: extensions/WikimediaMaintenance/maintenance/addWiki.php --wiki=abstractwiki # [[phab:T411723|T411723]] addWiki.php run
* 21:26 jforrester@deploy2002: mwscript-k8s job started: extensions/WikimediaMaintenance/maintenance/addWiki.php --wiki=abstractwiki # [[phab:T411723|T411723]] addWiki.php run
* 21:24 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255034{{!}}Revert "OrchestratorRequest: Switch evaluations to v2 endpoint" (T418887)]], [[gerrit:1247650{{!}}Create Abstract Wikipedia (T411725 T411726)]] (duration: 06m 44s)
* 21:20 jforrester@deploy2002: jforrester: Continuing with sync
* 21:20 jforrester@deploy2002: jforrester: Backport for [[gerrit:1255034{{!}}Revert "OrchestratorRequest: Switch evaluations to v2 endpoint" (T418887)]], [[gerrit:1247650{{!}}Create Abstract Wikipedia (T411725 T411726)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:17 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1255034{{!}}Revert "OrchestratorRequest: Switch evaluations to v2 endpoint" (T418887)]], [[gerrit:1247650{{!}}Create Abstract Wikipedia (T411725 T411726)]]
* 21:16 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5017.eqsin.wmnet with OS trixie
* 21:15 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns7001.wikimedia.org
* 21:12 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5020.eqsin.wmnet with OS trixie
* 21:08 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1255013{{!}}Guard for JS null deref on empty Parsoid sections (T419721)]], [[gerrit:1254889{{!}}Reapply "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]] (duration: 11m 20s)
* 21:07 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 21:07 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5028.eqsin.wmnet with OS trixie
* 21:04 jdlrobson@deploy2002: jdlrobson, harroyo-wmf: Continuing with sync
* 20:59 jdlrobson@deploy2002: jdlrobson, harroyo-wmf: Backport for [[gerrit:1255013{{!}}Guard for JS null deref on empty Parsoid sections (T419721)]], [[gerrit:1254889{{!}}Reapply "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:59 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-eqiad
* 20:58 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns7001.wikimedia.org
* 20:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5027.eqsin.wmnet with reason: host reimage
* 20:57 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1255013{{!}}Guard for JS null deref on empty Parsoid sections (T419721)]], [[gerrit:1254889{{!}}Reapply "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]]
* 20:52 herron@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-logging-codfw
* 20:51 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5027.eqsin.wmnet with reason: host reimage
* 20:51 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in1001.wikimedia.org with reason: [[phab:T419960|T419960]]
* 20:50 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5020.eqsin.wmnet with OS trixie
* 20:50 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-in2001.wikimedia.org with reason: [[phab:T419960|T419960]]
* 20:49 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 20:48 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5028.eqsin.wmnet with OS trixie
* 20:44 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5017.eqsin.wmnet with reason: host reimage
* 20:43 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns6002.wikimedia.org
* 20:42 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out1001.wikimedia.org with reason: [[phab:T419960|T419960]]
* 20:42 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1033.eqiad.wmnet with OS trixie
* 20:42 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 20:42 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 20:38 cscott@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254956{{!}}Limit legacy postprocessing cache to pages where DT does apply (T376183)]] (duration: 13m 54s)
* 20:37 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5017.eqsin.wmnet with reason: host reimage
* 20:35 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on mx-out2001.wikimedia.org with reason: [[phab:T419960|T419960]]
* 20:34 cscott@deploy2002: cscott: Continuing with sync
* 20:33 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns6002.wikimedia.org
* 20:28 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-upload_codfw and not P<nowiki>{</nowiki>cp2042.codfw.wmnet<nowiki>}</nowiki> and A:cp
* 20:28 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2058.codfw.wmnet
* 20:28 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on A:cp-text_codfw and not P<nowiki>{</nowiki>cp2041.codfw.wmnet<nowiki>}</nowiki> and A:cp
* 20:28 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2057.codfw.wmnet
* 20:26 cscott@deploy2002: cscott: Backport for [[gerrit:1254956{{!}}Limit legacy postprocessing cache to pages where DT does apply (T376183)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:25 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1033.eqiad.wmnet with reason: host reimage
* 20:24 cscott@deploy2002: Started scap sync-world: Backport for [[gerrit:1254956{{!}}Limit legacy postprocessing cache to pages where DT does apply (T376183)]]
* 20:24 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5027.eqsin.wmnet with OS trixie
* 20:22 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5029.*
* 20:21 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5029.eqsin.wmnet with OS trixie
* 20:20 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1033.eqiad.wmnet with reason: host reimage
* 20:18 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns6001.wikimedia.org
* 20:14 kemayo@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254965{{!}}Editcheck: fix tagging not happening for non-default checks]] (duration: 06m 28s)
* 20:10 kemayo@deploy2002: kemayo: Continuing with sync
* 20:10 kemayo@deploy2002: kemayo: Backport for [[gerrit:1254965{{!}}Editcheck: fix tagging not happening for non-default checks]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:09 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1033.eqiad.wmnet with OS trixie
* 20:08 kemayo@deploy2002: Started scap sync-world: Backport for [[gerrit:1254965{{!}}Editcheck: fix tagging not happening for non-default checks]]
* 20:06 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5028.eqsin.wmnet with OS trixie
* 20:05 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns6001.wikimedia.org
* 20:05 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5017.eqsin.wmnet with OS trixie
* 20:05 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5030.*
* 20:05 herron@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-logging-codfw
* 19:51 Reedy: running `foreachwikiindblist fishbowl.dblist extensions/OATHAuth/maintenance/UpdateSecretsToEncryptedFormat.php` [[phab:T404363|T404363]]
* 19:51 Reedy: running `foreachwikiindblist private.dblist extensions/OATHAuth/maintenance/UpdateSecretsToEncryptedFormat.php` [[phab:T404363|T404363]]
* 19:50 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns5004.wikimedia.org
* 19:50 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2056.codfw.wmnet
* 19:50 Reedy: running `mwscript extensions/OATHAuth/maintenance/UpdateSecretsToEncryptedFormat.php --wiki=metawiki` [[phab:T404363|T404363]]
* 19:49 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5029.eqsin.wmnet with reason: host reimage
* 19:49 reedy@deploy2002: Synchronized private/PrivateSettings.php: Set $wgOATHSecretKey [[phab:T404363|T404363]] (duration: 05m 51s)
* 19:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2055.codfw.wmnet
* 19:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5030.eqsin.wmnet with OS trixie
* 19:42 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5029.eqsin.wmnet with reason: host reimage
* 19:39 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp5017.eqsin.wmnet with OS trixie
* 19:35 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 19:35 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 19:33 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns5004.wikimedia.org
* 19:30 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host install4004.wikimedia.org with OS bookworm
* 19:29 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5020.eqsin.wmnet with OS trixie
* 19:28 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5020.eqsin.wmnet [reason: trixie reimaging]
* 19:28 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5018.eqsin.wmnet [reason: trixie reimaging]
* 19:27 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 19:27 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 19:26 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5018.eqsin.wmnet with OS trixie
* 19:23 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:23 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:18 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns5003.wikimedia.org
* 19:17 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 19:13 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5030.eqsin.wmnet with reason: host reimage
* 19:13 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on install4004.wikimedia.org with reason: host reimage
* 19:11 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2054.codfw.wmnet
* 19:08 brett@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts cp5029.eqsin.wmnet
* 19:08 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2053.codfw.wmnet
* 19:08 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5029.eqsin.wmnet with OS trixie
* 19:08 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on install4004.wikimedia.org with reason: host reimage
* 19:02 brett@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp5029.eqsin.wmnet
* 19:01 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns5003.wikimedia.org
* 18:56 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5031.*
* 18:56 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5031.eqsin.wmnet with OS trixie
* 18:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 18:47 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5018.eqsin.wmnet with reason: host reimage
* 18:46 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5030.eqsin.wmnet with OS trixie
* 18:46 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns4004.wikimedia.org
* 18:45 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host install4004.wikimedia.org with OS bookworm
* 18:45 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp5032.*
* 18:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp5032.eqsin.wmnet with OS trixie
* 18:32 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2052.codfw.wmnet
* 18:29 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2051.codfw.wmnet
* 18:27 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns4004.wikimedia.org
* 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5031.eqsin.wmnet with reason: host reimage
* 18:18 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5017.eqsin.wmnet with OS trixie
* 18:17 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5031.eqsin.wmnet with reason: host reimage
* 18:17 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5017.eqsin.wmnet [reason: trixie reimaging]
* 18:17 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp5018.eqsin.wmnet with OS trixie
* 18:16 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5018.eqsin.wmnet [reason: trixie reimaging]
* 18:16 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp5017.eqsin.wmnet [reason: trixie reimaging]
* 18:16 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp5017.eqsin.wmnet [reason: trixie reimaging]
* 18:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3077.esams.wmnet [reason: trixie reimaging]
* 18:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3078.esams.wmnet [reason: trixie reimaging]
* 18:12 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 18:12 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns4003.wikimedia.org
* 18:09 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1258: Ready
* 18:07 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp5032.eqsin.wmnet with reason: host reimage
* 18:01 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 17:59 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3078.esams.wmnet with OS trixie
* 17:56 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3077.esams.wmnet with OS trixie
* 17:55 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns4003.wikimedia.org
* 17:54 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2050.codfw.wmnet
* 17:51 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2049.codfw.wmnet
* 17:43 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:42 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:40 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns3004.wikimedia.org
* 17:39 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:38 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on backupmon1001.eqiad.wmnet with reason: upgrade
* 17:35 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1347.eqiad.wmnet with OS trixie
* 17:33 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3078.esams.wmnet with reason: host reimage
* 17:32 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 17:32 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5031.eqsin.wmnet with OS trixie
* 17:32 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5032.eqsin.wmnet with OS trixie
* 17:32 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp5031.eqsin.wmnet with OS trixie
* 17:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 17:30 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 17:30 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3077.esams.wmnet with reason: host reimage
* 17:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:29 claime: rearmed keyholder on deploy1003
* 17:28 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:27 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3078.esams.wmnet with reason: host reimage
* 17:26 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3077.esams.wmnet with reason: host reimage
* 17:25 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:25 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:23 fceratto@cumin1003: START - Cookbook sre.mysql.pool pool db1258: Ready
* 17:23 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns3004.wikimedia.org
* 17:21 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-ext: apply
* 17:21 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-ext: apply
* 17:21 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-jobrunner: apply
* 17:20 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-jobrunner: apply
* 17:20 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-esams and A:ncredir
* 17:19 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1347.eqiad.wmnet with reason: host reimage
* 17:18 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-drmrs and A:ncredir
* 17:16 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-eqiad and A:ncredir
* 17:15 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-ulsfo and A:ncredir
* 17:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2048.codfw.wmnet
* 17:15 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5032.eqsin.wmnet with OS trixie
* 17:14 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1347.eqiad.wmnet with reason: host reimage
* 17:14 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp5031.eqsin.wmnet with OS trixie
* 17:12 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 17:12 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2047.codfw.wmnet
* 17:11 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 17:09 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 17:09 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp3078.*
* 17:08 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-ext: apply
* 17:08 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-ext: apply
* 17:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3079.*
* 17:08 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns3003.wikimedia.org
* 17:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3078.*
* 17:07 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-eqiad and A:ncredir
* 17:07 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: apply
* 17:07 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-esams and A:ncredir
* 17:06 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: apply
* 17:06 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir2002.*
* 17:05 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-drmrs and A:ncredir
* 17:05 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir2002.codfw.wmnet
* 17:05 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-eqsin and A:ncredir
* 17:05 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-ulsfo and A:ncredir
* 17:04 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=0) rolling reboot on A:ncredir-magru and A:ncredir
* 17:03 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3078.esams.wmnet with OS trixie
* 17:02 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wikikube-worker1347
* 17:02 jayme@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wikikube-worker1347
* 17:02 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3078.esams.wmnet [reason: trixie reimaging]
* 17:01 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3076.esams.wmnet [reason: trixie reimaging]
* 17:01 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3077.esams.wmnet with OS trixie
* 17:01 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3077.esams.wmnet [reason: trixie reimaging]
* 16:59 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncredir2002.codfw.wmnet
* 16:58 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir2002.*
* 16:56 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 8 hosts with reason: upgrade
* 16:55 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir2001.*
* 16:55 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for ncredir2001.codfw.wmnet
* 16:55 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for ncredir2001.codfw.wmnet
* 16:55 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3076.esams.wmnet with OS trixie
* 16:53 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2014.codfw.wmnet
* 16:52 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-eqsin and A:ncredir
* 16:52 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2008.codfw.wmnet with reason: kernel update
* 16:51 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir-magru and A:ncredir
* 16:51 klausman@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on ml-serve1013.eqiad.wmnet with reason: Reboot for security update
* 16:50 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2013.codfw.wmnet
* 16:49 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir2001.*
* 16:49 brett@cumin2002: END (ERROR) - Cookbook sre.cdn.roll-restart-reboot-ncredir (exit_code=97) rolling reboot on A:ncredir and A:ncredir
* 16:48 jayme@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host wikikube-worker1347
* 16:48 jayme@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wikikube-worker1347.eqiad.wmnet 199.48.64.10.in-addr.arpa 9.9.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 16:48 jayme@cumin1003: START - Cookbook sre.dns.wipe-cache wikikube-worker1347.eqiad.wmnet 199.48.64.10.in-addr.arpa 9.9.1.0.8.4.0.0.4.6.0.0.0.1.0.0.7.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 16:48 jayme@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:47 jayme@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1347 - jayme@cumin1003"
* 16:47 jayme@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wikikube-worker1347 - jayme@cumin1003"
* 16:47 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns3003.wikimedia.org
* 16:47 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1012.eqiad.wmnet
* 16:47 brett@cumin2002: START - Cookbook sre.cdn.roll-restart-reboot-ncredir rolling reboot on A:ncredir and A:ncredir
* 16:47 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2012.codfw.wmnet
* 16:47 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2014.codfw.wmnet
* 16:46 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3075.esams.wmnet [reason: trixie reimaging]
* 16:46 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2003.codfw.wmnet
* 16:45 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet
* 16:44 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2013.codfw.wmnet
* 16:44 jayme@cumin1003: START - Cookbook sre.dns.netbox
* 16:43 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2009.codfw.wmnet
* 16:43 jayme@cumin1003: START - Cookbook sre.hosts.move-vlan for host wikikube-worker1347
* 16:43 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-cluster (exit_code=99)
* 16:43 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1347.eqiad.wmnet with OS trixie
* 16:43 brett@cumin2002: START - Cookbook sre.hosts.reboot-cluster
* 16:41 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2007.codfw.wmnet with reason: kernel update
* 16:40 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2012.codfw.wmnet
* 16:39 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3079.esams.wmnet with OS trixie
* 16:39 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup2008.codfw.wmnet
* 16:38 moritzm: installing PHP 8.2 security updates
* 16:37 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2009.codfw.wmnet
* 16:36 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2046.codfw.wmnet
* 16:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3078.esams.wmnet with OS trixie
* 16:34 moritzm: installing alsa-lib security updates
* 16:33 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2045.codfw.wmnet
* 16:32 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2008.codfw.wmnet
* 16:32 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns2006.wikimedia.org
* 16:29 moritzm: failover Ganeti master in eqiad to ganeti1046
* 16:29 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3076.esams.wmnet with reason: host reimage
* 16:29 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2003.codfw.wmnet
* 16:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1054.eqiad.wmnet
* 16:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1054.eqiad.wmnet
* 16:24 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy2005.codfw.wmnet with reason: kernel update
* 16:24 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3076.esams.wmnet with reason: host reimage
* 16:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1054.eqiad.wmnet
* 16:22 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1012.eqiad.wmnet
* 16:20 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1013.eqiad.wmnet
* 16:19 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1011.eqiad.wmnet
* 16:18 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1054.eqiad.wmnet
* 16:18 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 16:16 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host install4004.wikimedia.org with OS bookworm
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1053.eqiad.wmnet
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1053.eqiad.wmnet
* 16:14 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns2006.wikimedia.org
* 16:14 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1013.eqiad.wmnet
* 16:14 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1009.eqiad.wmnet
* 16:13 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3079.esams.wmnet with reason: host reimage
* 16:13 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1011.eqiad.wmnet
* 16:12 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1029.eqiad.wmnet with reason: kernel update
* 16:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1016.eqiad.wmnet with OS bookworm
* 16:12 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 16:11 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 16:11 moritzm: powercycling ganeti1053 (stuck on reboot)
* 16:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3078.esams.wmnet with reason: host reimage
* 16:09 klausman@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:ml-serve-worker-eqiad
* 16:09 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 16:08 klausman@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:ml-serve-worker-eqiad
* 16:07 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1009.eqiad.wmnet
* 16:07 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1003.eqiad.wmnet
* 16:06 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3079.esams.wmnet with reason: host reimage
* 16:06 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 16:06 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3078.esams.wmnet with reason: host reimage
* 16:04 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 16:04 klausman@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on A:ml-serve-worker-eqiad
* 16:04 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 16:02 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 16:01 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1028.eqiad.wmnet with reason: kernel update
* 16:00 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1003.eqiad.wmnet
* 16:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1015.eqiad.wmnet
* 16:00 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3075.esams.wmnet with OS trixie
* 16:00 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3076.esams.wmnet with OS trixie
* 15:59 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns2005.wikimedia.org
* 15:58 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3076.esams.wmnet [reason: trixie reimaging]
* 15:58 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1012.eqiad.wmnet
* 15:58 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2044.codfw.wmnet
* 15:58 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3078.esams.wmnet [reason: trixie reimaging]
* 15:57 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3078.esams.wmnet [reason: trixie reimaging]
* 15:57 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-serve1010.eqiad.wmnet
* 15:57 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1008.eqiad.wmnet
* 15:57 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3074.esams.wmnet [reason: trixie reimaging]
* 15:56 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1017.eqiad.wmnet with OS bookworm
* 15:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1053.eqiad.wmnet
* 15:55 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp2043.codfw.wmnet
* 15:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1017.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:54 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on dbproxy1023.eqiad.wmnet with reason: kernel update
* 15:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1015.eqiad.wmnet
* 15:54 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for dbproxy1022.eqiad.wmnet
* 15:54 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1008.eqiad.wmnet
* 15:54 fceratto@cumin1003: START - Cookbook sre.hosts.remove-downtime for dbproxy1022.eqiad.wmnet
* 15:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1053.eqiad.wmnet
* 15:52 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1052.eqiad.wmnet
* 15:52 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-serve1010.eqiad.wmnet
* 15:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1052.eqiad.wmnet
* 15:51 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1012.eqiad.wmnet
* 15:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3074.esams.wmnet with OS trixie
* 15:49 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-upload_codfw and not P<nowiki>{</nowiki>cp2042.codfw.wmnet<nowiki>}</nowiki> and A:cp
* 15:48 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1014.eqiad.wmnet
* 15:48 klausman@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on A:ml-serve-worker-eqiad
* 15:47 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host deploy1003.eqiad.wmnet
* 15:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1052.eqiad.wmnet
* 15:46 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on A:cp-text_codfw and not P<nowiki>{</nowiki>cp2041.codfw.wmnet<nowiki>}</nowiki> and A:cp
* 15:45 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1017.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:45 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns2005.wikimedia.org
* 15:42 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1014.eqiad.wmnet
* 15:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1052.eqiad.wmnet
* 15:41 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3079.esams.wmnet with OS trixie
* 15:41 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3078.esams.wmnet with OS trixie
* 15:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
* 15:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1051.eqiad.wmnet
* 15:39 fceratto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on dbproxy1022.eqiad.wmnet with reason: kernel update
* 15:38 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Update for dse-k8s-worker1016 - btullis@cumin1003"
* 15:37 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Update for dse-k8s-worker1016 - btullis@cumin1003"
* 15:37 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host deploy1003.eqiad.wmnet
* 15:37 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1027.eqiad.wmnet
* 15:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1016.eqiad.wmnet with reason: host reimage
* 15:35 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-eqiad
* 15:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1051.eqiad.wmnet
* 15:34 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3075.esams.wmnet with reason: host reimage
* 15:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1372.eqiad.wmnet
* 15:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1371.eqiad.wmnet
* 15:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1370.eqiad.wmnet
* 15:31 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1027.eqiad.wmnet
* 15:30 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns2004.wikimedia.org
* 15:29 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1016.eqiad.wmnet with reason: host reimage
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1369.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1368.eqiad.wmnet
* 15:28 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1372.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1367.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1366.eqiad.wmnet
* 15:28 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1371.eqiad.wmnet
* 15:28 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1370.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1365.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1364.eqiad.wmnet
* 15:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1363.eqiad.wmnet
* 15:27 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1362.eqiad.wmnet
* 15:27 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1361.eqiad.wmnet
* 15:27 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1017
* 15:27 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1360.eqiad.wmnet
* 15:26 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1017
* 15:25 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3074.esams.wmnet with reason: host reimage
* 15:25 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install4004.wikimedia.org with OS bookworm
* 15:25 sukhe@dns1004: END - running authdns-update
* 15:24 sukhe@dns1004: START - running authdns-update
* 15:24 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host install4004.wikimedia.org
* 15:24 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host install4004.wikimedia.org with OS bookworm
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1369.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1368.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1367.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1366.eqiad.wmnet
* 15:23 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1365.eqiad.wmnet
* 15:23 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1364.eqiad.wmnet
* 15:22 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1363.eqiad.wmnet
* 15:22 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1362.eqiad.wmnet
* 15:22 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1361.eqiad.wmnet
* 15:22 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1360.eqiad.wmnet
* 15:20 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1349.eqiad.wmnet
* 15:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1050.eqiad.wmnet
* 15:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1050.eqiad.wmnet
* 15:18 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3075.esams.wmnet with reason: host reimage
* 15:18 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3074.esams.wmnet with reason: host reimage
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1348.eqiad.wmnet
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1346.eqiad.wmnet
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1344.eqiad.wmnet
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1345.eqiad.wmnet
* 15:17 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1343.eqiad.wmnet
* 15:16 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1342.eqiad.wmnet
* 15:16 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns2004.wikimedia.org
* 15:15 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1349.eqiad.wmnet
* 15:15 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1341.eqiad.wmnet
* 15:15 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1340.eqiad.wmnet
* 15:15 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1339.eqiad.wmnet
* 15:15 moritzm: imported jenkins 2.541.3 for bullseye/bookworm/trixie
* 15:14 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1338.eqiad.wmnet
* 15:14 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1016.eqiad.wmnet with OS bookworm
* 15:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1050.eqiad.wmnet
* 15:12 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1348.eqiad.wmnet
* 15:12 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1346.eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1336.eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1337.eqiad.wmnet
* 15:12 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1345.eqiad.wmnet
* 15:12 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1344.eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1334.eqiad.wmnet
* 15:12 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1335.eqiad.wmnet
* 15:11 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1343.eqiad.wmnet
* 15:11 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1342.eqiad.wmnet
* 15:11 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1332.eqiad.wmnet
* 15:11 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1333.eqiad.wmnet
* 15:11 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Post reimage - btullis@cumin1003"
* 15:11 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Post reimage - btullis@cumin1003"
* 15:10 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1341.eqiad.wmnet
* 15:10 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1340.eqiad.wmnet
* 15:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1331.eqiad.wmnet
* 15:10 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1330.eqiad.wmnet
* 15:10 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1339.eqiad.wmnet
* 15:09 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1329.eqiad.wmnet
* 15:09 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1338.eqiad.wmnet
* 15:09 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1328.eqiad.wmnet
* 15:09 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1050.eqiad.wmnet
* 15:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1049.eqiad.wmnet
* 15:07 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1337.eqiad.wmnet
* 15:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1049.eqiad.wmnet
* 15:07 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1336.eqiad.wmnet
* 15:07 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1335.eqiad.wmnet
* 15:06 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1334.eqiad.wmnet
* 15:06 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1333.eqiad.wmnet
* 15:06 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1332.eqiad.wmnet
* 15:05 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1331.eqiad.wmnet
* 15:05 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1330.eqiad.wmnet
* 15:04 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1329.eqiad.wmnet
* 15:04 blake@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1328.eqiad.wmnet
* 15:03 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-eqiad
* 15:02 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs1033.eqiad.wmnet with OS trixie
* 15:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1049.eqiad.wmnet
* 15:01 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns1006.wikimedia.org
* 14:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum4002.ulsfo.wmnet
* 14:58 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:58 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:57 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
* 14:54 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3075.esams.wmnet with OS trixie
* 14:54 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3075.esams.wmnet [reason: trixie reimaging]
* 14:54 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3074.esams.wmnet with OS trixie
* 14:53 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3074.esams.wmnet [reason: trixie reimaging]
* 14:52 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1047.eqiad.wmnet
* 14:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1047.eqiad.wmnet
* 14:50 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:49 slyngshede@dns1004: END - running authdns-update
* 14:48 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:48 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:48 slyngshede@dns1004: START - running authdns-update
* 14:47 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns1006.wikimedia.org
* 14:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1047.eqiad.wmnet
* 14:46 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum4002.ulsfo.wmnet
* 14:45 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-eqiad
* 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts durum4001.ulsfo.wmnet
* 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum4001.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:44 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: durum4001.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 14:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1047.eqiad.wmnet
* 14:44 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1046.eqiad.wmnet
* 14:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1046.eqiad.wmnet
* 14:40 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:38 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1046.eqiad.wmnet
* 14:36 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts durum4001.ulsfo.wmnet
* 14:34 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host install4004.wikimedia.org with OS bookworm
* 14:33 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1046.eqiad.wmnet
* 14:32 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns1005.wikimedia.org
* 14:32 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "inline pattern and pattern equivalence - oblivian@cumin1003"
* 14:32 oblivian@cumin1003: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: inline pattern and pattern equivalence - oblivian@cumin1003
* 14:31 oblivian@cumin1003: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: inline pattern and pattern equivalence - oblivian@cumin1003
* 14:31 oblivian@cumin1003: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "inline pattern and pattern equivalence - oblivian@cumin1003"
* 14:30 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1045.eqiad.wmnet
* 14:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1045.eqiad.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install4004.wikimedia.org - jmm@cumin2002"
* 14:25 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM install4004.wikimedia.org - jmm@cumin2002"
* 14:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1045.eqiad.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) install4004.wikimedia.org on all recursors
* 14:24 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install4004.wikimedia.org on all recursors
* 14:21 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.wipe-cache (exit_code=99) install4004.wikimedia.org on all recursors
* 14:21 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache install4004.wikimedia.org on all recursors
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:21 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install4004.wikimedia.org - jmm@cumin2002"
* 14:20 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM install4004.wikimedia.org - jmm@cumin2002"
* 14:19 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns1005.wikimedia.org
* 14:17 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254911{{!}}Restore quotation-marks in ext.wikilambda.app messages (T420456)]] (duration: 06m 32s)
* 14:17 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:17 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1373.eqiad.wmnet with OS bookworm
* 14:17 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:16 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:16 apine@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:16 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 14:16 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host install4004.wikimedia.org
* 14:15 apine@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:15 apine@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:14 apine@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:14 apine@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:14 jforrester@deploy2002: jforrester: Continuing with sync
* 14:13 apine@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:13 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:13 jforrester@deploy2002: jforrester: Backport for [[gerrit:1254911{{!}}Restore quotation-marks in ext.wikilambda.app messages (T420456)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:13 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 14:11 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1254911{{!}}Restore quotation-marks in ext.wikilambda.app messages (T420456)]]
* 14:08 apine@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:08 apine@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:07 apine@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:06 apine@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:05 XioNoX: set graceful-shutdown on EdgeUno transit sessions
* 14:05 apine@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:04 apine@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:04 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot finished rebooting dns1004.wikimedia.org
* 14:02 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1045.eqiad.wmnet
* 14:01 klausman@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:ml-serve-worker-eqiad
* 14:01 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:59 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 13:57 Msz2001: UTC afternoon backport+config window done
* 13:56 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254876{{!}}Tweak configuration of external link aggregate usage analysis (T419837)]] (duration: 06m 41s)
* 13:55 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1033.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:55 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1033.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 13:53 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1373.eqiad.wmnet with reason: host reimage
* 13:52 mszwarc@deploy2002: mszwarc: Continuing with sync
* 13:51 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1254876{{!}}Tweak configuration of external link aggregate usage analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:50 sukhe@cumin1003: cookbooks.sre.dns.roll-reboot begin reboot of dns1004.wikimedia.org
* 13:50 sukhe@cumin1003: START - Cookbook sre.dns.roll-reboot rolling reboot on A:dnsbox
* 13:49 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1254876{{!}}Tweak configuration of external link aggregate usage analysis (T419837)]]
* 13:49 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254916{{!}}Normalize external domain names in click analysis (T419837)]], [[gerrit:1254917{{!}}Normalize external domain names in click analysis (T419837)]] (duration: 07m 23s)
* 13:46 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet
* 13:45 mszwarc@deploy2002: mszwarc: Continuing with sync
* 13:43 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1254916{{!}}Normalize external domain names in click analysis (T419837)]], [[gerrit:1254917{{!}}Normalize external domain names in click analysis (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:43 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2096.codfw.wmnet
* 13:41 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1033.eqiad.wmnet with OS trixie
* 13:41 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS bookworm
* 13:41 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1254916{{!}}Normalize external domain names in click analysis (T419837)]], [[gerrit:1254917{{!}}Normalize external domain names in click analysis (T419837)]]
* 13:40 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248095{{!}}filebackend: Remove outdated comment]], [[gerrit:1254216{{!}}GrowthExperiments: increase edit and thanks query limit II (T341599)]] (duration: 08m 47s)
* 13:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1044.eqiad.wmnet
* 13:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1044.eqiad.wmnet
* 13:39 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet
* 13:39 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1096.eqiad.wmnet
* 13:37 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2096.codfw.wmnet
* 13:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:36 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2095.codfw.wmnet
* 13:36 sgimeno@deploy2002: matmarex, sgimeno: Continuing with sync
* 13:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1044.eqiad.wmnet
* 13:33 sgimeno@deploy2002: matmarex, sgimeno: Backport for [[gerrit:1248095{{!}}filebackend: Remove outdated comment]], [[gerrit:1254216{{!}}GrowthExperiments: increase edit and thanks query limit II (T341599)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:31 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1096.eqiad.wmnet
* 13:31 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1095.eqiad.wmnet
* 13:31 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1248095{{!}}filebackend: Remove outdated comment]], [[gerrit:1254216{{!}}GrowthExperiments: increase edit and thanks query limit II (T341599)]]
* 13:30 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2095.codfw.wmnet
* 13:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2094.codfw.wmnet
* 13:28 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254894{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254895{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254891{{!}}Revert "SpecialPreferences: Use Language Select Widget in language field" (T419895)]], [[gerrit:1254890{{!}}Revert "SpecialPreferences: Use Language Select Widget in lan
* 13:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 13:28 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 13:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1026.eqiad.wmnet
* 13:27 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1044.eqiad.wmnet
* 13:26 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 13:25 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1095.eqiad.wmnet
* 13:25 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1094.eqiad.wmnet
* 13:24 sgimeno@deploy2002: somerandomdeveloper, sgimeno: Continuing with sync
* 13:24 sgimeno@deploy2002: somerandomdeveloper, sgimeno: Backport for [[gerrit:1254894{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254895{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254891{{!}}Revert "SpecialPreferences: Use Language Select Widget in language field" (T419895)]], [[gerrit:1254890{{!}}Revert "SpecialPreferences: Use Language Select Widget in
* 13:23 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1016.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:23 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2094.codfw.wmnet
* 13:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2093.codfw.wmnet
* 13:22 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1254894{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254895{{!}}loggedOutWarning: dont set the schema for experiment events (T420451)]], [[gerrit:1254891{{!}}Revert "SpecialPreferences: Use Language Select Widget in language field" (T419895)]], [[gerrit:1254890{{!}}Revert "SpecialPreferences: Use Language Select Widget in lang
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1043.eqiad.wmnet
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1043.eqiad.wmnet
* 13:21 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1026.eqiad.wmnet
* 13:20 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:16 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2093.codfw.wmnet
* 13:16 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2092.codfw.wmnet
* 13:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1043.eqiad.wmnet
* 13:16 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1094.eqiad.wmnet
* 13:16 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1093.eqiad.wmnet
* 13:15 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:15 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 13:15 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 13:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1043.eqiad.wmnet
* 13:10 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1042.eqiad.wmnet
* 13:10 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:10 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 13:09 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1093.eqiad.wmnet
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1042.eqiad.wmnet
* 13:09 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2092.codfw.wmnet
* 13:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1027.eqiad.wmnet with reason: host reimage
* 13:08 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1016
* 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 13:07 jclark@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 13:06 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1016
* 13:06 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 13:04 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1027.eqiad.wmnet with reason: host reimage
* 13:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1042.eqiad.wmnet
* 13:02 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 13:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1012.eqiad.wmnet
* 12:58 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1020.eqiad.wmnet with OS bookworm
* 12:58 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 12:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1042.eqiad.wmnet
* 12:56 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1015.eqiad.wmnet
* 12:55 ayounsi@dns1004: END - running authdns-update
* 12:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1012.eqiad.wmnet
* 12:54 ayounsi@dns1004: START - running authdns-update
* 12:53 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1012.eqiad.wmnet
* 12:53 ayounsi@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 12:51 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1015.eqiad.wmnet
* 12:50 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 12:50 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1041.eqiad.wmnet
* 12:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1041.eqiad.wmnet
* 12:47 cgoubert@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-main-codfw
* 12:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1041.eqiad.wmnet
* 12:42 btullis@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-jumbo-eqiad
* 12:38 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:37 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:37 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:36 ayounsi@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 12:35 blake@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1372].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 12:33 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 12:32 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:31 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:30 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1020.eqiad.wmnet with reason: host reimage
* 12:27 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1041.eqiad.wmnet
* 12:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1040.eqiad.wmnet
* 12:25 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1020.eqiad.wmnet with reason: host reimage
* 12:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1040.eqiad.wmnet
* 12:25 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 12:25 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1026.eqiad.wmnet with OS bookworm
* 12:25 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:24 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Update for dse-k8s-worker1015 - btullis@cumin1003"
* 12:24 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Update for dse-k8s-worker1015 - btullis@cumin1003"
* 12:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 12:22 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:21 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1040.eqiad.wmnet
* 12:19 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:19 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host serpens.wikimedia.org
* 12:13 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254851{{!}}Enable autodemotion for 2FA-less CN admins and WMF T&S (T418580)]] (duration: 06m 21s)
* 12:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host serpens.wikimedia.org
* 12:10 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1040.eqiad.wmnet
* 12:10 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:10 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-eqiad
* 12:09 mszwarc@deploy2002: mszwarc: Continuing with sync
* 12:09 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1254851{{!}}Enable autodemotion for 2FA-less CN admins and WMF T&S (T418580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:07 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1254851{{!}}Enable autodemotion for 2FA-less CN admins and WMF T&S (T418580)]]
* 12:07 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1039.eqiad.wmnet
* 12:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1039.eqiad.wmnet
* 12:05 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:05 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-eqiad
* 12:04 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:03 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:02 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254883{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254884{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]], [[gerrit:1254882{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254881{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]] (duration: 06m 48s)
* 12:02 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1026.eqiad.wmnet with reason: host reimage
* 12:02 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-eqiad
* 12:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1039.eqiad.wmnet
* 12:01 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1039.eqiad.wmnet
* 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1038.eqiad.wmnet
* 12:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1038.eqiad.wmnet
* 11:59 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1026.eqiad.wmnet with reason: host reimage
* 11:58 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 11:58 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1012.eqiad.wmnet
* 11:57 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1254883{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254884{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]], [[gerrit:1254882{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254881{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]] synced to the testservers (see https://wikitech.wikimedia.
* 11:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 11:56 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1328-1372].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 11:56 cgoubert@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-main-codfw
* 11:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1027.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 11:55 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1254883{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254884{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]], [[gerrit:1254882{{!}}Make it follow thumb steps (T402792 T414805)]], [[gerrit:1254881{{!}}DjvuHandler: Make it follow thumb steps (T402792 T414805 T416620 T418178)]]
* 11:54 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1027.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 11:54 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 11:51 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1038.eqiad.wmnet
* 11:50 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:50 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Updating for dse-k8s-worker1012 - btullis@cumin1003"
* 11:49 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Updating for dse-k8s-worker1012 - btullis@cumin1003"
* 11:49 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1015.eqiad.wmnet with reason: host reimage
* 11:48 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1038.eqiad.wmnet
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1037.eqiad.wmnet
* 11:48 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1037.eqiad.wmnet
* 11:48 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1307.eqiad.wmnet
* 11:48 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1307.eqiad.wmnet
* 11:47 claime: sudo homer lsw1-e5-eqiad* commit 'wikikube-worker1307 to active'
* 11:47 cgoubert@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:46 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1015.eqiad.wmnet with reason: host reimage
* 11:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1027.eqiad.wmnet with OS bookworm
* 11:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1026.eqiad.wmnet with OS bookworm
* 11:44 cgoubert@cumin1003: START - Cookbook sre.dns.netbox
* 11:42 cgoubert@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1026.eqiad.wmnet with OS bookworm
* 11:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard1003.eqiad.wmnet
* 11:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1037.eqiad.wmnet
* 11:39 cgoubert@cumin1003: START - Cookbook sre.dns.netbox
* 11:37 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1037.eqiad.wmnet
* 11:37 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2091.codfw.wmnet
* 11:37 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1036.eqiad.wmnet
* 11:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1036.eqiad.wmnet
* 11:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard1003.eqiad.wmnet
* 11:36 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-worker1347.eqiad.wmnet
* 11:34 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1020.eqiad.wmnet with OS bookworm
* 11:31 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 11:31 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2091.codfw.wmnet
* 11:30 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1092.eqiad.wmnet
* 11:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2090.codfw.wmnet
* 11:30 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-worker1347.eqiad.wmnet
* 11:30 jgiannelos@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mobileapps: apply
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host puppetboard2003.codfw.wmnet
* 11:30 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 11:30 jgiannelos@deploy2002: helmfile [eqiad] START helmfile.d/services/mobileapps: apply
* 11:29 jgiannelos@deploy2002: helmfile [codfw] DONE helmfile.d/services/mobileapps: apply
* 11:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1036.eqiad.wmnet
* 11:29 jgiannelos@deploy2002: helmfile [codfw] START helmfile.d/services/mobileapps: apply
* 11:28 jgiannelos@deploy2002: helmfile [staging] DONE helmfile.d/services/mobileapps: apply
* 11:28 jgiannelos@deploy2002: helmfile [staging] START helmfile.d/services/mobileapps: apply
* 11:26 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host puppetboard2003.codfw.wmnet
* 11:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1036.eqiad.wmnet
* 11:24 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2090.codfw.wmnet
* 11:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2089.codfw.wmnet
* 11:23 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1092.eqiad.wmnet
* 11:23 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1091.eqiad.wmnet
* 11:22 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1035.eqiad.wmnet
* 11:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1035.eqiad.wmnet
* 11:20 btullis@cumin1003: END (FAIL) - Cookbook sre.network.configure-switch-interfaces (exit_code=99) for host dse-k8s-worker1015
* 11:20 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1015
* 11:18 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1091.eqiad.wmnet
* 11:18 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1090.eqiad.wmnet
* 11:18 vgutierrez@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: no reason specified, no task ID specified]
* 11:18 vgutierrez@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: no reason specified, no task ID specified]
* 11:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping2004.codfw.wmnet
* 11:17 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2089.codfw.wmnet
* 11:17 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2088.codfw.wmnet
* 11:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1035.eqiad.wmnet
* 11:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping2004.codfw.wmnet
* 11:13 vgutierrez@dns1004: END - running authdns-update
* 11:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1035.eqiad.wmnet
* 11:11 vgutierrez@dns1004: START - running authdns-update
* 11:11 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1090.eqiad.wmnet
* 11:11 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1089.eqiad.wmnet
* 11:10 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2088.codfw.wmnet
* 11:10 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2087.codfw.wmnet
* 11:08 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1034.eqiad.wmnet
* 11:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1034.eqiad.wmnet
* 11:07 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1026.eqiad.wmnet with OS bookworm
* 11:05 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1026.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 11:05 btullis@cumin1003: END (PASS) - Cookbook sre.hadoop.reboot-workers (exit_code=0) for Hadoop test cluster
* 11:04 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1089.eqiad.wmnet
* 11:04 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1026.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 11:04 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1088.eqiad.wmnet
* 11:03 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2087.codfw.wmnet
* 11:03 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2086.codfw.wmnet
* 11:03 vgutierrez@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1034.eqiad.wmnet
* 11:00 vgutierrez@cumin1003: START - Cookbook sre.dns.netbox
* 10:59 btullis@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-jumbo-eqiad
* 10:58 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1034.eqiad.wmnet
* 10:57 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1088.eqiad.wmnet
* 10:57 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1087.eqiad.wmnet
* 10:57 fabfur@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 10:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
* 10:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1033.eqiad.wmnet
* 10:56 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2086.codfw.wmnet
* 10:56 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2085.codfw.wmnet
* 10:56 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 10:53 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1087.eqiad.wmnet
* 10:53 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1086.eqiad.wmnet
* 10:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1033.eqiad.wmnet
* 10:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
* 10:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1032.eqiad.wmnet
* 10:49 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2085.codfw.wmnet
* 10:49 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1032.eqiad.wmnet
* 10:49 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2084.codfw.wmnet
* 10:46 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1086.eqiad.wmnet
* 10:46 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1085.eqiad.wmnet
* 10:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1032.eqiad.wmnet
* 10:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1032.eqiad.wmnet
* 10:40 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2084.codfw.wmnet
* 10:40 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1031.eqiad.wmnet
* 10:40 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2083.codfw.wmnet
* 10:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1031.eqiad.wmnet
* 10:39 fabfur@cumin1003: START - Cookbook sre.dns.netbox
* 10:39 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1085.eqiad.wmnet
* 10:39 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1084.eqiad.wmnet
* 10:37 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2001.codfw.wmnet
* 10:34 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:34 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1031.eqiad.wmnet
* 10:32 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1084.eqiad.wmnet
* 10:32 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1083.eqiad.wmnet
* 10:32 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2083.codfw.wmnet
* 10:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1031.eqiad.wmnet
* 10:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2082.codfw.wmnet
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1030.eqiad.wmnet
* 10:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1030.eqiad.wmnet
* 10:32 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-staging2001.codfw.wmnet
* 10:31 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2002.codfw.wmnet
* 10:26 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-staging2002.codfw.wmnet
* 10:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1030.eqiad.wmnet
* 10:25 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1083.eqiad.wmnet
* 10:25 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1082.eqiad.wmnet
* 10:25 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2082.codfw.wmnet
* 10:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2081.codfw.wmnet
* 10:24 btullis@cumin1003: START - Cookbook sre.hadoop.reboot-workers for Hadoop test cluster
* 10:23 klausman@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ml-staging2003.codfw.wmnet
* 10:23 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1030.eqiad.wmnet
* 10:23 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1029.eqiad.wmnet
* 10:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1029.eqiad.wmnet
* 10:19 klausman@cumin1003: START - Cookbook sre.hosts.reboot-single for host ml-staging2003.codfw.wmnet
* 10:18 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1082.eqiad.wmnet
* 10:17 vgutierrez@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: no reason specified, no task ID specified]
* 10:17 vgutierrez@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: no reason specified, no task ID specified]
* 10:17 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2081.codfw.wmnet
* 10:17 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2001.codfw.wmnet
* 10:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1029.eqiad.wmnet
* 10:15 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1029.eqiad.wmnet
* 10:14 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2001.codfw.wmnet
* 10:14 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-staging-ctrl2002.codfw.wmnet
* 10:13 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1081.eqiad.wmnet
* 10:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1028.eqiad.wmnet
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1028.eqiad.wmnet
* 10:11 vgutierrez@dns1004: END - running authdns-update
* 10:10 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-staging-ctrl2002.codfw.wmnet
* 10:10 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2001.codfw.wmnet
* 10:09 vgutierrez@dns1004: START - running authdns-update
* 10:07 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2080.codfw.wmnet
* 10:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1028.eqiad.wmnet
* 10:06 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1081.eqiad.wmnet
* 10:06 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1080.eqiad.wmnet
* 10:05 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2001.codfw.wmnet
* 10:05 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl2002.codfw.wmnet
* 10:04 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: pool ulsfo [reason: no reason specified, [[phab:T418971|T418971]]]
* 10:04 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: no reason specified, [[phab:T418971|T418971]]]
* 10:03 slyngshede@cumin1003: END (FAIL) - Cookbook sre.dns.admin (exit_code=99) DNS admin: pool ulsfo [reason: no reason specified, no task ID specified]
* 10:03 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: pool ulsfo [reason: no reason specified, no task ID specified]
* 10:01 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl2002.codfw.wmnet
* 10:01 slyngshede@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for 23 hosts
* 10:01 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1028.eqiad.wmnet
* 10:01 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1002.eqiad.wmnet
* 10:01 slyngshede@cumin1003: START - Cookbook sre.hosts.remove-downtime for 23 hosts
* 09:59 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1080.eqiad.wmnet
* 09:59 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1079.eqiad.wmnet
* 09:59 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2080.codfw.wmnet
* 09:59 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2079.codfw.wmnet
* 09:58 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1002.eqiad.wmnet
* 09:57 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-serve-ctrl1001.eqiad.wmnet
* 09:56 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 09:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1027.eqiad.wmnet
* 09:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1027.eqiad.wmnet
* 09:52 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-serve-ctrl1001.eqiad.wmnet
* 09:51 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2079.codfw.wmnet
* 09:51 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2078.codfw.wmnet
* 09:51 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1079.eqiad.wmnet
* 09:51 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1003.eqiad.wmnet
* 09:51 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1078.eqiad.wmnet
* 09:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb1003.eqiad.wmnet
* 09:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1027.eqiad.wmnet
* 09:48 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1003.eqiad.wmnet
* 09:48 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1002.eqiad.wmnet
* 09:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb1003.eqiad.wmnet
* 09:46 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1027.eqiad.wmnet
* 09:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1026.eqiad.wmnet
* 09:46 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1026.eqiad.wmnet
* 09:46 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1002.eqiad.wmnet
* 09:46 klausman@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM ml-etcd1001.eqiad.wmnet
* 09:45 moritzm: installing postgresql-15 security updates
* 09:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.upgrade (exit_code=0) restart A:lvs-secondary-ulsfo and A:liberica
* 09:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) pooling P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2078.codfw.wmnet
* 09:45 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2077.codfw.wmnet
* 09:45 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.admin pooling P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:45 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) depooling P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:44 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.admin depooling P<nowiki>{</nowiki>lvs4010.ulsfo.wmnet<nowiki>}</nowiki> and A:liberica
* 09:44 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netboxdb2003.codfw.wmnet
* 09:44 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.upgrade restart A:lvs-secondary-ulsfo and A:liberica
* 09:44 jayme: switched wikikube staging apiservers to IPIP and maglev in eqiad and codfw - [[phab:T352956|T352956]]
* 09:43 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1078.eqiad.wmnet
* 09:43 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1077.eqiad.wmnet
* 09:42 klausman@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM ml-etcd1001.eqiad.wmnet
* 09:40 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: wikikube-staging-master-eqiad@eqiad
* 09:40 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:40 slyngshede@cumin1003: END (PASS) - Cookbook sre.loadbalancer.admin (exit_code=0) config_reloading A:lvs-secondary-ulsfo and A:liberica ([[phab:T418971|T418971]])
* 09:40 slyngshede@cumin1003: START - Cookbook sre.loadbalancer.admin config_reloading A:lvs-secondary-ulsfo and A:liberica ([[phab:T418971|T418971]])
* 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netboxdb2003.codfw.wmnet
* 09:40 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1026.eqiad.wmnet
* 09:39 jayme@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-eqiad or A:lvs-secondary-eqiad) and A:bullseye and A:lvs
* 09:38 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2077.codfw.wmnet
* 09:37 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2076.codfw.wmnet
* 09:37 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-codfw
* 09:37 jayme@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: wikikube-staging-master-eqiad@eqiad
* 09:36 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1077.eqiad.wmnet
* 09:36 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1076.eqiad.wmnet
* 09:35 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.migrate-service-ipip (exit_code=0) for alias: wikikube-staging-master-codfw@codfw
* 09:35 jayme@cumin1003: END (PASS) - Cookbook sre.loadbalancer.restart-pybal (exit_code=0) rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:35 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1026.eqiad.wmnet
* 09:30 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2076.codfw.wmnet
* 09:30 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2075.codfw.wmnet
* 09:26 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1076.eqiad.wmnet
* 09:26 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1075.eqiad.wmnet
* 09:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1025.eqiad.wmnet
* 09:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1025.eqiad.wmnet
* 09:23 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2075.codfw.wmnet
* 09:23 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2074.codfw.wmnet
* 09:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netbox-dev2003.codfw.wmnet
* 09:19 jayme@cumin1003: START - Cookbook sre.loadbalancer.restart-pybal rolling-restart of pybal on (A:lvs-low-traffic-codfw or A:lvs-secondary-codfw) and A:bullseye and A:lvs
* 09:18 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1075.eqiad.wmnet
* 09:18 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1074.eqiad.wmnet
* 09:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netbox-dev2003.codfw.wmnet
* 09:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1025.eqiad.wmnet
* 09:15 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2074.codfw.wmnet
* 09:15 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2073.codfw.wmnet
* 09:14 jayme@cumin1003: START - Cookbook sre.loadbalancer.migrate-service-ipip for alias: wikikube-staging-master-codfw@codfw
* 09:14 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1025.eqiad.wmnet
* 09:13 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-codfw
* 09:12 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1024.eqiad.wmnet
* 09:12 klausman@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-reboot (exit_code=0) rolling reboot on A:ml-cache-eqiad
* 09:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1024.eqiad.wmnet
* 09:10 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1074.eqiad.wmnet
* 09:10 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1073.eqiad.wmnet
* 09:08 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2073.codfw.wmnet
* 09:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2072.codfw.wmnet
* 09:08 slyngshede@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 23 hosts with reason: Update ULSFO LVS service IPs
* 09:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1024.eqiad.wmnet
* 09:03 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1073.eqiad.wmnet
* 09:03 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1072.eqiad.wmnet
* 09:02 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2072.codfw.wmnet
* 09:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2071.codfw.wmnet
* 09:02 slyngshede@cumin1003: END (PASS) - Cookbook sre.dns.admin (exit_code=0) DNS admin: depool ulsfo [reason: no reason specified, [[phab:T418971|T418971]]]
* 09:02 slyngshede@cumin1003: START - Cookbook sre.dns.admin DNS admin: depool ulsfo [reason: no reason specified, [[phab:T418971|T418971]]]
* 09:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ping1004.eqiad.wmnet
* 09:00 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1024.eqiad.wmnet
* 08:58 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1023.eqiad.wmnet
* 08:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti1023.eqiad.wmnet
* 08:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ping1004.eqiad.wmnet
* 08:56 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1072.eqiad.wmnet
* 08:56 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1071.eqiad.wmnet
* 08:55 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2071.codfw.wmnet
* 08:55 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2070.codfw.wmnet
* 08:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti1023.eqiad.wmnet
* 08:50 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1023.eqiad.wmnet
* 08:48 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1071.eqiad.wmnet
* 08:47 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2070.codfw.wmnet
* 08:46 klausman@cumin1003: START - Cookbook sre.cassandra.roll-reboot rolling reboot on A:ml-cache-eqiad
* 08:29 hashar: Restarting CI Jenkins for plugin upgrade # [[phab:T420347|T420347]]
* 08:22 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.20 refs [[phab:T413811|T413811]]
* 07:45 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'configure' for AS: 32934
* 07:42 btullis@cumin1003: END (PASS) - Cookbook sre.hadoop.reboot-workers (exit_code=0) for Hadoop analytics cluster
* 07:35 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'configure' for AS: 32934
* 07:22 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
* 07:16 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
* 06:54 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/machinetranslation: apply
* 06:38 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/machinetranslation: apply
* 03:22 musikanimal@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254468{{!}}CM5: add more aggressive warnings about CM5 deprecation (T373720)]] (duration: 12m 22s)
* 03:18 musikanimal@deploy2002: musikanimal: Continuing with sync
* 03:11 musikanimal@deploy2002: musikanimal: Backport for [[gerrit:1254468{{!}}CM5: add more aggressive warnings about CM5 deprecation (T373720)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 03:09 musikanimal@deploy2002: Started scap sync-world: Backport for [[gerrit:1254468{{!}}CM5: add more aggressive warnings about CM5 deprecation (T373720)]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 47s)
* 02:07 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 02:07 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 02:06 dani@deploy1003: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 02:05 dani@deploy1003: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 02:05 dani@deploy1003: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 02:05 dani@deploy1003: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 02:05 dani@deploy1003: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 02:04 dani@deploy1003: helmfile [staging] START helmfile.d/services/miscweb: apply
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:38 denisse@deploy2002: Finished deploy [librenms/librenms@9bdfb73]: Upgrade LibreNMS to 26.3.1 (duration: 00m 19s)
* 01:38 denisse@deploy2002: Started deploy [librenms/librenms@9bdfb73]: Upgrade LibreNMS to 26.3.1
* 01:10 denisse@deploy2002: Finished deploy [librenms/librenms@d152b36]: Upgrade LibreNMS to 25.11.0 (duration: 00m 08s)
* 01:10 denisse@deploy2002: Started deploy [librenms/librenms@d152b36]: Upgrade LibreNMS to 25.11.0
== 2026-03-17 ==
* 23:44 btullis@cumin1003: START - Cookbook sre.hadoop.reboot-workers for Hadoop analytics cluster
* 23:38 btullis@cumin1003: END (FAIL) - Cookbook sre.hadoop.reboot-workers (exit_code=99) for Hadoop analytics cluster
* 22:55 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3081.*
* 22:20 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3073.esams.wmnet [reason: trixie reimaging]
* 22:15 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3073.esams.wmnet with OS trixie
* 22:11 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3072.esams.wmnet [reason: trixie reimaging]
* 22:10 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3072.esams.wmnet with OS trixie
* 22:05 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on releases1003.eqiad.wmnet with reason: [[phab:T420246|T420246]]
* 22:05 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on releases2003.codfw.wmnet with reason: [[phab:T420246|T420246]]
* 21:48 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3073.esams.wmnet with reason: host reimage
* 21:44 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3072.esams.wmnet with reason: host reimage
* 21:41 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3073.esams.wmnet with reason: host reimage
* 21:39 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3072.esams.wmnet with reason: host reimage
* 21:38 ryankemper: [[phab:T411568|T411568]] Failed back HDFS NameNode from an-master1004 to an-master1003; cluster back to original active/standby configuration
* 21:15 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3073.esams.wmnet with OS trixie
* 21:14 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3073.esams.wmnet [reason: trixie reimaging]
* 21:14 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3072.esams.wmnet with OS trixie
* 21:14 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3072.esams.wmnet [reason: trixie reimaging]
* 21:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3070.esams.wmnet [reason: trixie reimaging]
* 21:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3071.esams.wmnet [reason: trixie reimaging]
* 21:09 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3070.esams.wmnet with OS trixie
* 21:05 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3071.esams.wmnet with OS trixie
* 20:59 alexsanford@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254280{{!}}Remove notice from login form in popup mode (T418534)]] (duration: 07m 32s)
* 20:56 alexsanford@deploy2002: alexsanford: Continuing with sync
* 20:54 alexsanford@deploy2002: alexsanford: Backport for [[gerrit:1254280{{!}}Remove notice from login form in popup mode (T418534)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:52 alexsanford@deploy2002: Started scap sync-world: Backport for [[gerrit:1254280{{!}}Remove notice from login form in popup mode (T418534)]]
* 20:48 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:43 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:43 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3070.esams.wmnet with reason: host reimage
* 20:40 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 20:40 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 20:38 ryankemper: [[phab:T411568|T411568]] failed over HDFS NameNode from an-master1003 to an-master1004, then rebooted `an-master1003`
* 20:38 ryankemper: [[phab:T411568|T411568]] rebooted `an-coord1003`, `an-coord1004`, `an-tool1007`, `an-tool1008`, `an-tool1011`, `an-web1001`
* 20:38 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3071.esams.wmnet with reason: host reimage
* 20:34 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3070.esams.wmnet with reason: host reimage
* 20:34 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3071.esams.wmnet with reason: host reimage
* 20:31 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254301{{!}}Passwordless login: Don't display conditional auth errors]], [[gerrit:1254302{{!}}Passwordless login: Don't display conditional auth errors]] (duration: 08m 56s)
* 20:27 catrope@deploy2002: catrope: Continuing with sync
* 20:24 catrope@deploy2002: catrope: Backport for [[gerrit:1254301{{!}}Passwordless login: Don't display conditional auth errors]], [[gerrit:1254302{{!}}Passwordless login: Don't display conditional auth errors]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:22 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1254301{{!}}Passwordless login: Don't display conditional auth errors]], [[gerrit:1254302{{!}}Passwordless login: Don't display conditional auth errors]]
* 20:16 ryankemper: [[phab:T411568|T411568]] rebooted `an-test-master1002`, `an-test-master1003`, `an-test-master1004`, `archiva1002`
* 20:12 aude@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251309{{!}}Set wgReadingListsBetaDefaultForNewAccountsAfter for beta cluster (T419163)]] (duration: 08m 53s)
* 20:09 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3071.esams.wmnet with OS trixie
* 20:09 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3070.esams.wmnet with OS trixie
* 20:08 aude@deploy2002: aude: Continuing with sync
* 20:08 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3070.esams.wmnet [reason: trixie reimaging]
* 20:08 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3068.esams.wmnet [reason: trixie reimaging]
* 20:07 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3069.esams.wmnet [reason: trixie reimaging]
* 20:06 aude@deploy2002: aude: Backport for [[gerrit:1251309{{!}}Set wgReadingListsBetaDefaultForNewAccountsAfter for beta cluster (T419163)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:03 aude@deploy2002: Started scap sync-world: Backport for [[gerrit:1251309{{!}}Set wgReadingListsBetaDefaultForNewAccountsAfter for beta cluster (T419163)]]
* 19:58 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3081.esams.wmnet with OS trixie
* 19:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3069.esams.wmnet with OS trixie
* 19:54 ryankemper: [[phab:T411568|T411568]] rebooted `an-test-client1002`, `an-test-ui1001`, `an-test-coord1001`, `an-test-master1001`
* 19:50 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3068.esams.wmnet with OS trixie
* 19:46 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 19:39 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast2003.wikimedia.org with OS trixie
* 19:32 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3081.esams.wmnet with reason: host reimage
* 19:28 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3081.esams.wmnet with reason: host reimage
* 19:28 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on releases1003.eqiad.wmnet with reason: [[phab:T420246|T420246]]
* 19:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3069.esams.wmnet with reason: host reimage
* 19:23 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3068.esams.wmnet with reason: host reimage
* 19:21 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3069.esams.wmnet with reason: host reimage
* 19:20 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3068.esams.wmnet with reason: host reimage
* 19:19 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast2003.wikimedia.org with reason: host reimage
* 19:11 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on bast2003.wikimedia.org with reason: host reimage
* 19:08 dzahn@dns1004: END - running authdns-update
* 19:07 dzahn@dns1004: START - running authdns-update
* 19:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3081.esams.wmnet with OS trixie
* 19:05 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp3081.esams.wmnet with OS trixie
* 19:00 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp3080.*
* 18:56 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3069.esams.wmnet with OS trixie
* 18:55 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3069.esams.wmnet [reason: trixie reimaging]
* 18:55 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3068.esams.wmnet with OS trixie
* 18:55 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid public cluster: Reboot Druid nodes
* 18:54 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3068.esams.wmnet [reason: trixie reimaging]
* 18:53 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3066.esams.wmnet [reason: trixie reimaging]
* 18:53 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp3067.esams.wmnet [reason: trixie reimaging]
* 18:50 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host bast2003.wikimedia.org with OS trixie
* 18:49 swfrench-wmf: manually uncordoned wikikube-worker-exp1001.eqiad.wmnet after failed reboot
* 18:47 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3080.esams.wmnet with OS trixie
* 18:43 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3067.esams.wmnet with OS trixie
* 18:40 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp3066.esams.wmnet with OS trixie
* 18:32 dwisehaupt@dns1005: END - running authdns-update
* 18:31 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host bast2003.wikimedia.org with OS bookworm
* 18:31 dwisehaupt@dns1005: START - running authdns-update
* 18:25 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 18:25 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 18:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3080.esams.wmnet with reason: host reimage
* 18:19 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[5-8].magru.wmnet<nowiki>}</nowiki> and A:cp
* 18:19 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7008.magru.wmnet
* 18:17 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:16 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 18:16 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3067.esams.wmnet with reason: host reimage
* 18:16 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 18:13 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp3066.esams.wmnet with reason: host reimage
* 18:09 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3080.esams.wmnet with reason: host reimage
* 18:09 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on bast2003.wikimedia.org with reason: host reimage
* 18:04 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3067.esams.wmnet with reason: host reimage
* 18:03 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp3066.esams.wmnet with reason: host reimage
* 18:03 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on bast2003.wikimedia.org with reason: host reimage
* 17:52 cgoubert@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1312-1327].eqiad.wmnet,wikikube-worker-exp1001.eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 17:52 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 17:52 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 17:46 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3080.esams.wmnet with OS trixie
* 17:44 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:43 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:43 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:42 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3081.esams.wmnet with OS trixie
* 17:42 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp3081.esams.wmnet with OS trixie
* 17:42 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:41 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid public cluster: Reboot Druid nodes
* 17:40 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:39 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp7013.magru.wmnet,cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 17:39 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7016.magru.wmnet
* 17:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:37 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:37 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7007.magru.wmnet
* 17:37 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:34 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:33 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:31 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:29 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3067.esams.wmnet with OS trixie
* 17:29 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3067.esams.wmnet [reason: trixie reimaging]
* 17:28 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3066.esams.wmnet with OS trixie
* 17:28 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:27 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp3066.esams.wmnet with OS trixie
* 17:27 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp3066.esams.wmnet with OS trixie
* 17:26 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:26 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp3066.esams.wmnet [reason: trixie reimaging]
* 17:21 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:20 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:19 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp3081.esams.wmnet with OS trixie
* 17:19 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:16 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1071.eqiad.wmnet
* 17:16 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:16 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2069.codfw.wmnet
* 17:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 17:14 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:14 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:14 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:13 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:13 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:10 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:09 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7014.*
* 17:09 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1071.eqiad.wmnet
* 17:09 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2069.codfw.wmnet
* 17:09 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:08 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1070.eqiad.wmnet
* 17:08 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2068.codfw.wmnet
* 17:07 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:06 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:06 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host bast2003.wikimedia.org with OS bookworm
* 17:06 cgoubert@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1312-1327].eqiad.wmnet,wikikube-worker-exp1001.eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 17:05 jhancock@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ['bast2003']
* 17:02 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2068.codfw.wmnet
* 17:02 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1070.eqiad.wmnet
* 17:02 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2067.codfw.wmnet
* 17:01 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1069.eqiad.wmnet
* 17:01 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 17:00 cgoubert@cumin1003: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 16:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7014.magru.wmnet with OS trixie
* 16:58 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 16:58 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 16:58 btullis@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-mirror-maker (exit_code=0) restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
* 16:57 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7015.magru.wmnet
* 16:56 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7006.magru.wmnet
* 16:55 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1069.eqiad.wmnet
* 16:55 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2067.codfw.wmnet
* 16:53 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1068.eqiad.wmnet
* 16:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2066.codfw.wmnet
* 16:47 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 16:47 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist all cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 16:46 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 16:46 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['bast2003']
* 16:45 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1068.eqiad.wmnet
* 16:45 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2066.codfw.wmnet
* 16:44 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1067.eqiad.wmnet
* 16:44 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2065.codfw.wmnet
* 16:42 btullis@cumin1003: END (PASS) - Cookbook sre.druid.reboot-workers (exit_code=0) for Druid analytics cluster: Reboot Druid nodes
* 16:40 btullis@cumin1003: START - Cookbook sre.kafka.roll-restart-mirror-maker restart MirrorMaker for Kafka A:kafka-mirror-maker-jumbo-eqiad cluster: Roll restart of jvm daemons.
* 16:37 cgoubert@cumin1003: START - Cookbook sre.dns.netbox
* 16:36 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1067.eqiad.wmnet
* 16:36 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2065.codfw.wmnet
* 16:36 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2064.codfw.wmnet
* 16:35 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1066.eqiad.wmnet
* 16:34 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist group2 cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 16:33 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul2003.codfw.wmnet with OS trixie
* 16:33 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7014.magru.wmnet with reason: host reimage
* 16:32 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wikikube-worker[1306,1308-1311].eqiad.wmnet
* 16:32 cgoubert@cumin1003: START - Cookbook sre.hosts.remove-downtime for wikikube-worker[1306,1308-1311].eqiad.wmnet
* 16:28 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7014.magru.wmnet with reason: host reimage
* 16:28 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1066.eqiad.wmnet
* 16:28 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2064.codfw.wmnet
* 16:25 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on releases2003.codfw.wmnet with reason: [[phab:T420246|T420246]]
* 16:25 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2063.codfw.wmnet
* 16:25 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1065.eqiad.wmnet
* 16:25 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1306,1308-1311].eqiad.wmnet
* 16:25 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1306,1308-1311].eqiad.wmnet
* 16:18 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1065.eqiad.wmnet
* 16:18 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2063.codfw.wmnet
* 16:18 btullis@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-jumbo-eqiad
* 16:17 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be2062.codfw.wmnet
* 16:17 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1064.eqiad.wmnet
* 16:15 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7013.magru.wmnet
* 16:15 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul2003.codfw.wmnet with reason: host reimage
* 16:14 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7005.magru.wmnet
* 16:10 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be2062.codfw.wmnet
* 16:08 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul2003.codfw.wmnet with reason: host reimage
* 16:07 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1064.eqiad.wmnet
* 16:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7014.magru.wmnet with OS trixie
* 16:05 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp7014.magru.wmnet with OS trixie
* 16:03 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp7013.magru.wmnet,cp701[5-6].magru.wmnet<nowiki>}</nowiki> and A:cp
* 16:03 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[5-8].magru.wmnet<nowiki>}</nowiki> and A:cp
* 15:54 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1009.eqiad.wmnet
* 15:54 mutante: zuul2003 - reimaging with trixie
* 15:52 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist group1 cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 15:51 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2033.codfw.wmnet
* 15:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2033.codfw.wmnet
* 15:46 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul2003.codfw.wmnet with OS trixie
* 15:45 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1009.eqiad.wmnet
* 15:45 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1008.eqiad.wmnet
* 15:44 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist group0 cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 15:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2033.codfw.wmnet
* 15:38 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 15:38 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2048.codfw.wmnet
* 15:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2048.codfw.wmnet
* 15:36 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1008.eqiad.wmnet
* 15:36 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1007.eqiad.wmnet
* 15:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2048.codfw.wmnet
* 15:34 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1012.eqiad.wmnet with reason: host reimage
* 15:33 samtar@deploy2002: mwscript-k8s job started: foreachwikiindblist testwikis cleanupWatchlistLabelMember.php # [[phab:T420328|T420328]]
* 15:32 btullis@cumin1003: START - Cookbook sre.druid.reboot-workers for Druid analytics cluster: Reboot Druid nodes
* 15:28 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1007.eqiad.wmnet
* 15:28 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1006.eqiad.wmnet
* 15:27 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1012.eqiad.wmnet with reason: host reimage
* 15:27 samtar@deploy2002: mwscript-k8s job started: cleanupWatchlistLabelMember.php --wiki=testwiki # [[phab:T420328|T420328]]
* 15:27 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2008-dev.codfw.wmnet
* 15:25 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1015.eqiad.wmnet with OS bookworm
* 15:23 jmm@deploy2002: helmfile [eqiad] DONE helmfile.d/services/thumbor: apply
* 15:22 btullis@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-jumbo-eqiad
* 15:21 jmm@deploy2002: helmfile [eqiad] START helmfile.d/services/thumbor: apply
* 15:20 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2008-dev.codfw.wmnet
* 15:20 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1006.eqiad.wmnet
* 15:20 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be1005.eqiad.wmnet
* 15:18 jmm@deploy2002: helmfile [codfw] DONE helmfile.d/services/thumbor: apply
* 15:18 urbanecm@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244723{{!}}cleanup: Growth: Remove temporary GrowthMentorList overrides (T418518)]] (duration: 06m 32s)
* 15:16 jmm@deploy2002: helmfile [codfw] START helmfile.d/services/thumbor: apply
* 15:16 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 16509
* 15:14 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2005-dev.codfw.wmnet
* 15:14 urbanecm@deploy2002: urbanecm: Continuing with sync
* 15:13 urbanecm@deploy2002: urbanecm: Backport for [[gerrit:1244723{{!}}cleanup: Growth: Remove temporary GrowthMentorList overrides (T418518)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:13 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 15:12 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2048.codfw.wmnet
* 15:11 urbanecm@deploy2002: Started scap sync-world: Backport for [[gerrit:1244723{{!}}cleanup: Growth: Remove temporary GrowthMentorList overrides (T418518)]]
* 15:10 brennen@deploy2002: Finished deploy [phabricator/deployment@e845707]: deploy phab1004 for [[phab:T420366|T420366]] (duration: 01m 02s)
* 15:09 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be1005.eqiad.wmnet
* 15:09 brennen@deploy2002: Started deploy [phabricator/deployment@e845707]: deploy phab1004 for [[phab:T420366|T420366]]
* 15:09 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254217{{!}}Create dblists for wikis where CheckUser and AbuseFilter are disabled (T420063 T420062)]] (duration: 06m 38s)
* 15:09 brennen@deploy2002: Finished deploy [phabricator/deployment@e845707]: deploy phab2002 for [[phab:T420366|T420366]] (duration: 00m 35s)
* 15:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudservices2005-dev.codfw.wmnet
* 15:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2004-dev.codfw.wmnet
* 15:08 brennen@deploy2002: Started deploy [phabricator/deployment@e845707]: deploy phab2002 for [[phab:T420366|T420366]]
* 15:08 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2009.codfw.wmnet
* 15:05 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
* 15:05 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 15:05 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
* 15:04 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1254217{{!}}Create dblists for wikis where CheckUser and AbuseFilter are disabled (T420063 T420062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:04 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7014.magru.wmnet with OS trixie
* 15:03 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host phab1004.eqiad.wmnet
* 15:02 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1254217{{!}}Create dblists for wikis where CheckUser and AbuseFilter are disabled (T420063 T420062)]]
* 15:02 topranks: reset BGP session to ssw1-d8-eqiad from lsw1-d4-eqiad [[phab:T420180|T420180]]
* 15:02 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 15:02 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudservices2004-dev.codfw.wmnet
* 15:02 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2003-dev.codfw.wmnet
* 15:00 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2009.codfw.wmnet
* 15:00 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2008.codfw.wmnet
* 14:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2034.codfw.wmnet
* 14:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2034.codfw.wmnet
* 14:57 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host phab1004.eqiad.wmnet
* 14:55 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2003-dev.codfw.wmnet
* 14:55 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2002-dev.codfw.wmnet
* 14:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host durum4004.ulsfo.wmnet
* 14:53 jmm@deploy2002: helmfile [staging] DONE helmfile.d/services/thumbor: apply
* 14:53 jmm@deploy2002: helmfile [staging] START helmfile.d/services/thumbor: apply
* 14:52 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2008.codfw.wmnet
* 14:52 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2007.codfw.wmnet
* 14:51 topranks: stop accepting routes on ssw1-d8-eqiad from external peers (cr2-eqiad, other spines) [[phab:T420351|T420351]]
* 14:51 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
* 14:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host durum4004.ulsfo.wmnet
* 14:50 topranks: stop announcing routes from ssw1-d8-eqiad to external peers (cr2-eqiad, other spines) [[phab:T420351|T420351]]
* 14:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2034.codfw.wmnet
* 14:48 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2002-dev.codfw.wmnet
* 14:48 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudrabbit2001-dev.codfw.wmnet
* 14:46 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2007.codfw.wmnet
* 14:46 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2006.codfw.wmnet
* 14:45 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
* 14:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
* 14:44 taavi: deploying cr firewall changes from https://gerrit.wikimedia.org/r/c/operations/homer/public/+/1254211
* 14:44 topranks: stop announcing "direct" routes to ssw1-d8-eqiad from cr2-eqiad [[phab:T420351|T420351]]
* 14:44 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2034.codfw.wmnet
* 14:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2034.codfw.wmnet
* 14:43 moritzm: failover Ganeti master in codfw to ganeti2047
* 14:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2050.codfw.wmnet
* 14:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2050.codfw.wmnet
* 14:41 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudrabbit2001-dev.codfw.wmnet
* 14:41 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2007-dev.codfw.wmnet
* 14:40 topranks: disabling EVPN IBGP peering from ssw1-d8-eqiad to ssw1-d1-eqiad to stop them reflecting routes [[phab:T420351|T420351]]
* 14:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest1006.eqiad.wmnet
* 14:39 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
* 14:38 inflatador: bking@requestctl remove `wdqs_highest_error_rate_ever_seen` requestctl rule as it is no longer needed
* 14:38 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2006.codfw.wmnet
* 14:37 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host thanos-be2005.codfw.wmnet
* 14:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2050.codfw.wmnet
* 14:35 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2007-dev.codfw.wmnet
* 14:35 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2006-dev.codfw.wmnet
* 14:34 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest1006.eqiad.wmnet
* 14:34 Daimona: Creating ce_event_goals DB table for the CampaignEvents extension in x1.testwiki, x1.test2wiki, x1.officewiki, and x1.wikishared # [[phab:T411433|T411433]]
* 14:34 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2050.codfw.wmnet
* 14:33 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2049.codfw.wmnet
* 14:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2049.codfw.wmnet
* 14:31 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
* 14:30 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host thanos-be2005.codfw.wmnet
* 14:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2049.codfw.wmnet
* 14:27 topranks: de-pref internet circuits landing on cr2-eqiad to shift traffic to cr1 [[phab:T420351|T420351]]
* 14:27 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2006-dev.codfw.wmnet
* 14:27 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudnet2005-dev.codfw.wmnet
* 14:25 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2049.codfw.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2047.codfw.wmnet
* 14:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2047.codfw.wmnet
* 14:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-test-presto1001.eqiad.wmnet
* 14:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host an-test-presto1001.eqiad.wmnet
* 14:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2047.codfw.wmnet
* 14:19 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudnet2005-dev.codfw.wmnet
* 14:19 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudlb2004-dev.codfw.wmnet
* 14:17 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 14:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2047.codfw.wmnet
* 14:14 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2046.codfw.wmnet
* 14:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2046.codfw.wmnet
* 14:13 topranks: disable VRRP on cr2-eqiad interfaces facing ssw1-d8-eqiad [[phab:T420351|T420351]]
* 14:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 14:11 moritzm: powercycling ganeti2046 (stuck on reboot)
* 14:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2004.codfw.wmnet
* 14:10 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudlb2004-dev.codfw.wmnet
* 14:10 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudlb2003-dev.codfw.wmnet
* 14:07 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people2004.codfw.wmnet
* 14:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet
* 14:05 topranks: setting cr1-eqiad as VRRP master for all vlans [[phab:T420351|T420351]]
* 14:01 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudlb2003-dev.codfw.wmnet
* 14:00 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet
* 13:59 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2046.codfw.wmnet
* 13:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6002.drmrs.wmnet
* 13:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6002.drmrs.wmnet
* 13:57 aklapper@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.20 refs [[phab:T413811|T413811]]
* 13:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2046.codfw.wmnet
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2045.codfw.wmnet
* 13:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2045.codfw.wmnet
* 13:52 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudlb2002-dev.codfw.wmnet
* 13:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6002.drmrs.wmnet
* 13:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2045.codfw.wmnet
* 13:45 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254189{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]], [[gerrit:1254190{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]] (duration: 08m 10s)
* 13:44 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2003.codfw.wmnet
* 13:42 esanders@deploy2002: esanders: Continuing with sync
* 13:39 esanders@deploy2002: esanders: Backport for [[gerrit:1254189{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]], [[gerrit:1254190{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:38 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be2003.codfw.wmnet
* 13:38 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be2004.codfw.wmnet
* 13:37 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1254189{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]], [[gerrit:1254190{{!}}TitleWidget: Prioritise namespace prefix over interwiki prefix (T420288)]]
* 13:35 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on logstash2023.codfw.wmnet with reason: ganeti reboot
* 13:33 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 13:32 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host apus-be2004.codfw.wmnet
* 13:32 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2045.codfw.wmnet
* 13:32 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2002.codfw.wmnet
* 13:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6002.drmrs.wmnet
* 13:30 cscott@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251610{{!}}Turn on postprocessing cache for all Parsoid parses (T348255)]] (duration: 10m 31s)
* 13:28 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2045.codfw.wmnet
* 13:26 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be2002.codfw.wmnet
* 13:26 cscott@deploy2002: cscott: Continuing with sync
* 13:26 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be2001.codfw.wmnet
* 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6001.drmrs.wmnet
* 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6001.drmrs.wmnet
* 13:22 cscott@deploy2002: cscott: Backport for [[gerrit:1251610{{!}}Turn on postprocessing cache for all Parsoid parses (T348255)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2044.codfw.wmnet
* 13:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2044.codfw.wmnet
* 13:20 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be2001.codfw.wmnet
* 13:20 cgoubert@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker13[00-47].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 13:20 cscott@deploy2002: Started scap sync-world: Backport for [[gerrit:1251610{{!}}Turn on postprocessing cache for all Parsoid parses (T348255)]]
* 13:20 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:19 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:19 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2280-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 13:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6001.drmrs.wmnet
* 13:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6001.drmrs.wmnet
* 13:16 btullis@cumin1003: END (PASS) - Cookbook sre.presto.reboot-workers (exit_code=0) for Presto an-presto cluster: Reboot Presto nodes
* 13:15 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1003.eqiad.wmnet
* 13:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2044.codfw.wmnet
* 13:15 aklapper@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254166{{!}}Remove misplaced readonly from CategoryViewer::$query (T420315)]] (duration: 06m 31s)
* 13:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint1001.eqiad.wmnet
* 13:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint1001.eqiad.wmnet
* 13:11 aklapper@deploy2002: zabe, aklapper: Continuing with sync
* 13:11 aklapper@deploy2002: zabe, aklapper: Backport for [[gerrit:1254166{{!}}Remove misplaced readonly from CategoryViewer::$query (T420315)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 otto@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:10 otto@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:10 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be1003.eqiad.wmnet
* 13:10 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 16509
* 13:09 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apus-be1004.eqiad.wmnet
* 13:09 aklapper@deploy2002: Started scap sync-world: Backport for [[gerrit:1254166{{!}}Remove misplaced readonly from CategoryViewer::$query (T420315)]]
* 13:08 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ldap-maint2001.codfw.wmnet
* 13:04 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host apus-be1004.eqiad.wmnet
* 13:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ldap-maint2001.codfw.wmnet
* 13:02 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2044.codfw.wmnet
* 13:02 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1002.eqiad.wmnet
* 13:01 moritzm: failover Ganeti masters in drmrs to ganeti6003/6004
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6004.drmrs.wmnet
* 13:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6004.drmrs.wmnet
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2043.codfw.wmnet
* 12:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2043.codfw.wmnet
* 12:57 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 214657
* 12:56 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 214657
* 12:56 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 56308
* 12:55 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-eqiad
* 12:55 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 56308
* 12:55 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 28788
* 12:55 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be1002.eqiad.wmnet
* 12:55 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host moss-be1001.eqiad.wmnet
* 12:54 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 28788
* 12:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6004.drmrs.wmnet
* 12:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2043.codfw.wmnet
* 12:53 ayounsi@cumin1003: END (FAIL) - Cookbook sre.network.peering (exit_code=99) with action 'email' for AS: 28788
* 12:53 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 28788
* 12:53 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:52 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.peering (exit_code=0) with action 'email' for AS: 9269
* 12:52 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1012
* 12:52 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:51 ayounsi@cumin1003: START - Cookbook sre.network.peering with action 'email' for AS: 9269
* 12:51 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1012
* 12:51 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:51 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-e8-eqiad
* 12:51 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device lsw1-e8-eqiad
* 12:50 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 12:48 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1015
* 12:48 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host moss-be1001.eqiad.wmnet
* 12:45 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1015
* 12:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6004.drmrs.wmnet
* 12:44 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:44 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2043.codfw.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti6003.drmrs.wmnet
* 12:42 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti6003.drmrs.wmnet
* 12:40 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 12:39 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2042.codfw.wmnet
* 12:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2042.codfw.wmnet
* 12:38 moritzm: powercycling ganeti2042 (stuck on reboot)
* 12:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti6003.drmrs.wmnet
* 12:34 moritzm: powercycling ganeti2041 (stuck on reboot)
* 12:31 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:24 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti6003.drmrs.wmnet
* 12:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install1005.wikimedia.org
* 12:22 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 12:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2042.codfw.wmnet
* 12:20 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-cluster
* 12:20 Emperor: roll-reboot apus frontends (codfw) for March reboots
* 12:16 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install1005.wikimedia.org
* 12:13 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install2005.wikimedia.org
* 12:13 topranks: restart BGP announcements from ssw1-d1-eqiad following change [[phab:T420180|T420180]]
* 12:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2042.codfw.wmnet
* 12:08 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[2280-2331].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 12:07 dbrant@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifeeds: apply
* 12:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install2005.wikimedia.org
* 12:06 dbrant@deploy2002: helmfile [codfw] START helmfile.d/services/wikifeeds: apply
* 12:06 dbrant@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifeeds: apply
* 12:06 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install3004.wikimedia.org
* 12:06 dbrant@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifeeds: apply
* 12:05 jayme@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=(registry1005.eqiad.wmnet{{!}}registry2005.codfw.wmnet)
* 12:05 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2005.codfw.wmnet
* 12:05 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1005.eqiad.wmnet
* 12:04 dbrant@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifeeds: apply
* 12:04 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install3004.wikimedia.org
* 12:04 dbrant@deploy2002: helmfile [staging] START helmfile.d/services/wikifeeds: apply
* 12:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install4003.wikimedia.org
* 12:03 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c7-eqiad [[phab:T420180|T420180]]
* 12:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2041.codfw.wmnet
* 12:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2041.codfw.wmnet
* 12:01 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1005.eqiad.wmnet
* 12:01 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2005.codfw.wmnet
* 12:00 jayme@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=(registry1005.eqiad.wmnet{{!}}registry2005.codfw.wmnet)
* 12:00 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c6-eqiad [[phab:T420180|T420180]]
* 12:00 jayme@cumin1003: conftool action : set/pooled=yes; selector: service=docker-registry,name=(registry1004.eqiad.wmnet{{!}}registry2004.codfw.wmnet)
* 11:59 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry1004.eqiad.wmnet
* 11:59 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host registry2004.codfw.wmnet
* 11:59 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c4-eqiad [[phab:T420180|T420180]]
* 11:58 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c3-eqiad [[phab:T420180|T420180]]
* 11:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install4003.wikimedia.org
* 11:56 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-c2-eqiad [[phab:T420180|T420180]]
* 11:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install5003.wikimedia.org
* 11:55 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry1004.eqiad.wmnet
* 11:55 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host registry2004.codfw.wmnet
* 11:54 jayme@cumin1003: conftool action : set/pooled=no; selector: service=docker-registry,name=(registry1004.eqiad.wmnet{{!}}registry2004.codfw.wmnet)
* 11:54 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-d3-eqiad [[phab:T420180|T420180]]
* 11:53 topranks: reset BGP session to ssw1-d1-eqiad from lsw1-d1-eqiad [[phab:T420180|T420180]]
* 11:52 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-cluster (exit_code=0)
* 11:52 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5007.eqsin.wmnet
* 11:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5007.eqsin.wmnet
* 11:49 btullis@cumin1003: START - Cookbook sre.hadoop.reboot-workers for Hadoop analytics cluster
* 11:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install5003.wikimedia.org
* 11:48 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-eqiad
* 11:48 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install6003.wikimedia.org
* 11:47 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1012.eqiad.wmnet with OS bookworm
* 11:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install6003.wikimedia.org
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2041.codfw.wmnet
* 11:43 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5007.eqsin.wmnet
* 11:43 btullis@cumin1003: START - Cookbook sre.presto.reboot-workers for Presto an-presto cluster: Reboot Presto nodes
* 11:41 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5007.eqsin.wmnet
* 11:41 cgoubert@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker13[00-47].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 11:39 topranks: stop accepting external routes on ssw1-d1-eqiad from cr1-eqiad [[phab:T420180|T420180]]
* 11:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host install7002.wikimedia.org
* 11:33 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-cluster
* 11:33 Emperor: roll-reboot apus frontends (eqiad) for March reboots
* 11:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host install7002.wikimedia.org
* 11:28 moritzm: failover Ganeti master in eqsin to ganeti5004
* 11:27 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2041.codfw.wmnet
* 11:24 topranks: reduce local-preference for BGP routes learnt from servers on cr1-eqiad [[phab:T420180|T420180]]
* 11:22 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 11:18 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 11:16 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2040.codfw.wmnet
* 11:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2040.codfw.wmnet
* 11:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5006.eqsin.wmnet
* 11:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5006.eqsin.wmnet
* 11:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2040.codfw.wmnet
* 11:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5006.eqsin.wmnet
* 11:05 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 11:05 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:04 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 11:04 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5006.eqsin.wmnet
* 11:03 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2040.codfw.wmnet
* 11:01 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 11:01 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5005.eqsin.wmnet
* 11:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5005.eqsin.wmnet
* 11:00 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:59 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:58 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:58 topranks: prepend external BGP announcements from cr1-eqiad [[phab:T420180|T420180]]
* 10:57 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:56 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:56 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2039.codfw.wmnet
* 10:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2039.codfw.wmnet
* 10:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5005.eqsin.wmnet
* 10:52 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-eqiad
* 10:51 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:49 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:49 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:49 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5005.eqsin.wmnet
* 10:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2039.codfw.wmnet
* 10:46 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti5004.eqsin.wmnet
* 10:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti5004.eqsin.wmnet
* 10:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2039.codfw.wmnet
* 10:45 javiermonton@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 10:45 javiermonton@deploy2002: helmfile [codfw] START helmfile.d/services/mw-page-content-change-enrich: apply
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2038.codfw.wmnet
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2038.codfw.wmnet
* 10:43 javiermonton@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 10:43 javiermonton@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-page-content-change-enrich: apply
* 10:42 topranks: cease announcing routed networks from ssw1-d1-eqiad to cr1-eqiad in BGP [[phab:T420180|T420180]]
* 10:41 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:41 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:40 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:39 javiermonton@deploy2002: helmfile [staging] DONE helmfile.d/services/mw-page-content-change-enrich: apply
* 10:39 javiermonton@deploy2002: helmfile [staging] START helmfile.d/services/mw-page-content-change-enrich: apply
* 10:38 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2038.codfw.wmnet
* 10:37 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti5004.eqsin.wmnet
* 10:37 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:33 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudgw2004-dev.codfw.wmnet
* 10:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2038.codfw.wmnet
* 10:29 topranks: stop announcing directly connected routes to L3 switches from cr1-eqiad [[phab:T420180|T420180]]
* 10:28 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:27 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudgw2004-dev.codfw.wmnet
* 10:27 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudgw2003-dev.codfw.wmnet
* 10:26 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:25 topranks: disable EVPN IBGP peering between ssw1-d1-eqiad and ssw1-d8-eqiad [[phab:T420180|T420180]]
* 10:24 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:21 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 10:21 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 10:20 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudgw2003-dev.codfw.wmnet
* 10:20 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:19 urbanecm: Delete `job/growthexperiments-listtaskcounts-29513771` from mw-cron (job stuck for more than a month)
* 10:19 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid1003.eqiad.wmnet
* 10:16 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti5004.eqsin.wmnet
* 10:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2037.codfw.wmnet
* 10:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2037.codfw.wmnet
* 10:12 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid1003.eqiad.wmnet
* 10:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2002.codfw.wmnet
* 10:09 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2037.codfw.wmnet
* 10:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2002.codfw.wmnet
* 10:06 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2002.codfw.wmnet
* 10:05 topranks: disabling VRRP for et-1/0/5 sub-interfaces on cr1-eqiad [[phab:T420180|T420180]]
* 10:03 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 10:03 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:02 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:01 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:01 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 10:01 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 10:00 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf2002.codfw.wmnet
* 10:00 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf2001.codfw.wmnet
* 09:59 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:59 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:59 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:59 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:58 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host failoid2003.codfw.wmnet
* 09:57 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:57 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 09:57 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich-next: apply
* 09:56 topranks: shift traffic from codfw to eqiad off Arelion CCT to Lumen
* 09:56 mvernon@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies (exit_code=0) rolling reboot on A:thanos-fe
* 09:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host failoid2003.codfw.wmnet
* 09:54 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf2001.codfw.wmnet
* 09:54 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1002.eqiad.wmnet
* 09:53 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2037.codfw.wmnet
* 09:53 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 09:52 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:50 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:47 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1002.eqiad.wmnet
* 09:47 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-wf1001.eqiad.wmnet
* 09:42 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 09:42 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-wf1001.eqiad.wmnet
* 09:40 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2002.codfw.wmnet
* 09:38 moritzm: installing openssl bugfix updates on trixie hosts
* 09:31 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2002.codfw.wmnet
* 09:31 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc2001.codfw.wmnet
* 09:25 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc2001.codfw.wmnet
* 09:25 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1002.eqiad.wmnet
* 09:21 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-eqiad
* 09:20 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1002.eqiad.wmnet
* 09:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
* 09:15 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
* 09:10 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1254114{{!}}Revert "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]] (duration: 12m 36s)
* 09:06 topranks: increase VRRP priority on eqiad vlans on CR2 to shift active gateway to cr2-eqiad [[phab:T420180|T420180]]
* 09:05 mvernon@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-thanos-proxies rolling reboot on A:thanos-fe
* 09:03 kharlan@deploy2002: kharlan: Continuing with sync
* 09:02 kharlan@deploy2002: kharlan: Backport for [[gerrit:1254114{{!}}Revert "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:58 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-canary
* 08:57 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2036.codfw.wmnet
* 08:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2036.codfw.wmnet
* 08:57 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1254114{{!}}Revert "hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend" (T419125)]]
* 08:57 moritzm: rebuilt the trixie d-i image for the 13.4 point release [[phab:T420240|T420240]]
* 08:54 kharlan@deploy2002: Sync cancelled.
* 08:52 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-canary
* 08:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2036.codfw.wmnet
* 08:49 kharlan@deploy2002: harroyo-wmf, kharlan: Backport for [[gerrit:1250575{{!}}hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend (T419125)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:45 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2036.codfw.wmnet
* 08:44 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host bast2003.wikimedia.org
* 08:43 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1250575{{!}}hcaptcha: Enforce hCaptcha on API edits coming from the MobileFrontend (T419125)]]
* 08:42 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:42 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:35 arnaudb@cumin1003: END (PASS) - Cookbook sre.gerrit.restart-gerrit (exit_code=0) Restarting Gerrit on gerrit2002
* 08:34 arnaudb@cumin1003: START - Cookbook sre.gerrit.restart-gerrit Restarting Gerrit on gerrit2002
* 08:34 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host mc-misc1001.eqiad.wmnet
* 08:34 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host contint1002.wikimedia.org
* 08:32 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:28 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host mc-misc1001.eqiad.wmnet
* 08:27 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host contint1002.wikimedia.org
* 08:24 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2035.codfw.wmnet
* 08:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2035.codfw.wmnet
* 08:20 jiji@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host parsoidtest1001.eqiad.wmnet
* 08:18 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2035.codfw.wmnet
* 08:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti3005.esams.wmnet to cluster esams03 and group B
* 08:14 moritzm: powercycling bast2003 (stuck on reboot)
* 08:14 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti3005.esams.wmnet to cluster esams03 and group B
* 08:14 jiji@cumin1003: START - Cookbook sre.hosts.reboot-single for host parsoidtest1001.eqiad.wmnet
* 08:12 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3005.esams.wmnet
* 08:09 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:08 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 08:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3005.esams.wmnet
* 07:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast2003.wikimedia.org
* 07:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host bast5004.wikimedia.org
* 07:52 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti3005.esams.wmnet with OS bookworm
* 07:46 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast5004.wikimedia.org
* 07:37 jiji@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1285-1289,1291-1299].eqiad.wmnet
* 07:37 jiji@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1285-1289,1291-1299].eqiad.wmnet
* 07:34 arnaudb@cumin1003: END (PASS) - Cookbook sre.gerrit.restart-gerrit (exit_code=0) Restarting Gerrit on gerrit2003
* 07:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2035.codfw.wmnet
* 07:32 arnaudb@cumin1003: START - Cookbook sre.gerrit.restart-gerrit Restarting Gerrit on gerrit2003
* 07:32 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2033.codfw.wmnet
* 07:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2033.codfw.wmnet
* 07:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2032.codfw.wmnet
* 07:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2032.codfw.wmnet
* 07:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti3005.esams.wmnet with reason: host reimage
* 07:25 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti3005.esams.wmnet with reason: host reimage
* 07:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2032.codfw.wmnet
* 07:23 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2032.codfw.wmnet
* 07:00 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti3005.esams.wmnet with OS bookworm
* 06:08 kart_: Updated cxserver to 2026-03-16-071247-production ([[phab:T420004|T420004]])
* 06:07 kartik@deploy2002: helmfile [eqiad] DONE helmfile.d/services/cxserver: apply
* 06:06 kartik@deploy2002: helmfile [eqiad] START helmfile.d/services/cxserver: apply
* 06:05 kartik@deploy2002: helmfile [codfw] DONE helmfile.d/services/cxserver: apply
* 06:04 kartik@deploy2002: helmfile [codfw] START helmfile.d/services/cxserver: apply
* 05:58 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/cxserver: apply
* 05:58 kartik@deploy2002: helmfile [staging] START helmfile.d/services/cxserver: apply
* 04:41 dwisehaupt@dns1005: END - running authdns-update
* 04:39 dwisehaupt@dns1005: START - running authdns-update
* 04:01 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.17 (duration: 01m 17s)
* 03:43 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.20 refs [[phab:T413811|T413811]] (duration: 39m 34s)
* 03:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.20 refs [[phab:T413811|T413811]]
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 10s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 00:26 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6009.*
* 00:25 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6009.drmrs.wmnet with OS trixie
* 00:07 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251158{{!}}Enable languages in main menu on Russian Wikipedia (T419730)]] (duration: 06m 57s)
* 00:03 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 00:02 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1251158{{!}}Enable languages in main menu on Russian Wikipedia (T419730)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:00 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1251158{{!}}Enable languages in main menu on Russian Wikipedia (T419730)]]
== 2026-03-16 ==
* 23:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6009.drmrs.wmnet with reason: host reimage
* 23:56 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253604{{!}}Don't output language HTML when no languages present (T419730)]], [[gerrit:1251157{{!}}Support duplication of languages in header and main menu (T419730)]] (duration: 06m 44s)
* 23:56 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6009.drmrs.wmnet with reason: host reimage
* 23:52 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 23:51 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1253604{{!}}Don't output language HTML when no languages present (T419730)]], [[gerrit:1251157{{!}}Support duplication of languages in header and main menu (T419730)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:50 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1253604{{!}}Don't output language HTML when no languages present (T419730)]], [[gerrit:1251157{{!}}Support duplication of languages in header and main menu (T419730)]]
* 23:36 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6009.drmrs.wmnet with OS trixie
* 23:32 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp601(0{{!}}1).*
* 22:54 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6008.drmrs.wmnet [reason: trixie reimaging]
* 22:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6008.drmrs.wmnet with OS trixie
* 22:37 jforrester@deploy2002: Finished scap sync-world: [[phab:T411807|T411807]] (duration: 11m 10s)
* 22:35 jforrester@deploy2002: jforrester: Continuing with sync
* 22:32 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6010.drmrs.wmnet with OS trixie
* 22:31 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp70[09-12].magru.wmnet<nowiki>}</nowiki> and A:cp
* 22:31 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7012.magru.wmnet
* 22:31 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[1285-1289,1291-1299].eqiad.wmnet
* 22:30 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp700[1-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 22:30 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7004.magru.wmnet
* 22:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6011.drmrs.wmnet with OS trixie
* 22:28 jforrester@deploy2002: jforrester: [[phab:T411807|T411807]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:27 jforrester@deploy2002: Started scap sync-world: [[phab:T411807|T411807]]
* 22:27 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1285-1289,1291-1299].eqiad.wmnet
* 22:24 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6008.drmrs.wmnet with reason: host reimage
* 22:20 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6008.drmrs.wmnet with reason: host reimage
* 22:17 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.reboot-nodes (exit_code=1) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1020-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 22:07 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6010.drmrs.wmnet with reason: host reimage
* 22:05 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6007.drmrs.wmnet [reason: trixie reimaging]
* 22:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6011.drmrs.wmnet with reason: host reimage
* 22:03 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6007.drmrs.wmnet with OS trixie
* 22:02 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6008.drmrs.wmnet with OS trixie
* 21:59 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6010.drmrs.wmnet with reason: host reimage
* 21:58 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6011.drmrs.wmnet with reason: host reimage
* 21:58 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp6008.drmrs.wmnet with OS trixie
* 21:52 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7011.magru.wmnet
* 21:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7003.magru.wmnet
* 21:42 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host zuul1003.eqiad.wmnet with OS trixie
* 21:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6008.drmrs.wmnet with reason: host reimage
* 21:40 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6010.drmrs.wmnet with OS trixie
* 21:40 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6012.*
* 21:39 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6012.drmrs.wmnet with OS trixie
* 21:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6011.drmrs.wmnet with OS trixie
* 21:38 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6007.drmrs.wmnet with reason: host reimage
* 21:36 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6013.*
* 21:36 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6008.drmrs.wmnet with reason: host reimage
* 21:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6013.drmrs.wmnet with OS trixie
* 21:32 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6007.drmrs.wmnet with reason: host reimage
* 21:28 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on zuul1003.eqiad.wmnet with reason: host reimage
* 21:22 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on zuul1003.eqiad.wmnet with reason: host reimage
* 21:19 Dreamy_Jazz: Evening UTC backport window done
* 21:18 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251848{{!}}Disable CheckUser on closed wikis where no checks were ever made (T420062)]], [[gerrit:1251865{{!}}Uninstall SecurePoll from closed wikis (T420062)]], [[gerrit:1251888{{!}}DiscussionTools: Uninstall wikis closed before permalinks were deployed (T420052)]] (duration: 06m 10s)
* 21:17 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6008.drmrs.wmnet with OS trixie
* 21:16 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6008.drmrs.wmnet [reason: trixie reimaging]
* 21:15 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6006.drmrs.wmnet [reason: trixie reimaging]
* 21:15 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6006.drmrs.wmnet with OS trixie
* 21:14 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 21:14 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1251848{{!}}Disable CheckUser on closed wikis where no checks were ever made (T420062)]], [[gerrit:1251865{{!}}Uninstall SecurePoll from closed wikis (T420062)]], [[gerrit:1251888{{!}}DiscussionTools: Uninstall wikis closed before permalinks were deployed (T420052)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:13 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6012.drmrs.wmnet with reason: host reimage
* 21:12 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6007.drmrs.wmnet with OS trixie
* 21:12 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1251848{{!}}Disable CheckUser on closed wikis where no checks were ever made (T420062)]], [[gerrit:1251865{{!}}Uninstall SecurePoll from closed wikis (T420062)]], [[gerrit:1251888{{!}}DiscussionTools: Uninstall wikis closed before permalinks were deployed (T420052)]]
* 21:12 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6007.drmrs.wmnet [reason: trixie reimaging]
* 21:11 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6005.drmrs.wmnet [reason: trixie reimaging]
* 21:10 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6005.drmrs.wmnet with OS trixie
* 21:10 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7010.magru.wmnet
* 21:10 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7002.magru.wmnet
* 21:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6013.drmrs.wmnet with reason: host reimage
* 21:08 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host zuul1003.eqiad.wmnet with OS trixie
* 21:07 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6012.drmrs.wmnet with reason: host reimage
* 21:06 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6013.drmrs.wmnet with reason: host reimage
* 21:05 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253623{{!}}Fix client credentials access tokens (T417278 T419921)]], [[gerrit:1253625{{!}}Enable $wgTrackMediaRequestProvenance on testwikis and beta cluster (T414338)]], [[gerrit:1253626{{!}}Configure $wgApiClientErrorSampleRate (T418957)]] (duration: 08m 06s)
* 21:01 catrope@deploy2002: matmarex, catrope: Continuing with sync
* 20:59 catrope@deploy2002: matmarex, catrope: Backport for [[gerrit:1253623{{!}}Fix client credentials access tokens (T417278 T419921)]], [[gerrit:1253625{{!}}Enable $wgTrackMediaRequestProvenance on testwikis and beta cluster (T414338)]], [[gerrit:1253626{{!}}Configure $wgApiClientErrorSampleRate (T418957)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:57 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1253623{{!}}Fix client credentials access tokens (T417278 T419921)]], [[gerrit:1253625{{!}}Enable $wgTrackMediaRequestProvenance on testwikis and beta cluster (T414338)]], [[gerrit:1253626{{!}}Configure $wgApiClientErrorSampleRate (T418957)]]
* 20:54 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:54 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:50 brett@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cp[2027-2040].codfw.wmnet
* 20:50 brett@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:50 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2027-2040].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:50 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cp[2027-2040].codfw.wmnet decommissioned, removing all IPs except the asset tag one - brett@cumin2002"
* 20:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6006.drmrs.wmnet with reason: host reimage
* 20:48 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6012.drmrs.wmnet with OS trixie
* 20:46 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6013.drmrs.wmnet with OS trixie
* 20:45 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on cp2042.codfw.wmnet with reason: Testing hosts - not for production
* 20:45 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6005.drmrs.wmnet with reason: host reimage
* 20:44 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2042.codfw.wmnet with OS trixie
* 20:44 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudcephmon2007-dev.codfw.wmnet with OS bookworm
* 20:44 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 20:44 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253566{{!}}Configure external link aggregate usage on 12 wikis for top domains (T419837)]] (duration: 06m 59s)
* 20:43 brett@cumin2002: START - Cookbook sre.dns.netbox
* 20:41 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6006.drmrs.wmnet with reason: host reimage
* 20:41 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on cp2041.codfw.wmnet with reason: Testing hosts - not for production
* 20:41 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6005.drmrs.wmnet with reason: host reimage
* 20:40 kharlan@deploy2002: kharlan, mszwarc: Continuing with sync
* 20:39 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2041.codfw.wmnet with OS trixie
* 20:39 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 20:38 kharlan@deploy2002: kharlan, mszwarc: Backport for [[gerrit:1253566{{!}}Configure external link aggregate usage on 12 wikis for top domains (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:37 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1253566{{!}}Configure external link aggregate usage on 12 wikis for top domains (T419837)]]
* 20:34 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6014.*
* 20:33 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:33 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:32 cscott@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253551{{!}}Fix double post-processing in legacy preview case (T419908)]] (duration: 06m 52s)
* 20:29 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7009.magru.wmnet
* 20:28 cscott@deploy2002: cscott: Continuing with sync
* 20:28 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp7001.magru.wmnet
* 20:27 cscott@deploy2002: cscott: Backport for [[gerrit:1253551{{!}}Fix double post-processing in legacy preview case (T419908)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:26 cscott@deploy2002: Started scap sync-world: Backport for [[gerrit:1253551{{!}}Fix double post-processing in legacy preview case (T419908)]]
* 20:22 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2042.codfw.wmnet with reason: host reimage
* 20:22 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6006.drmrs.wmnet with OS trixie
* 20:21 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6006.drmrs.wmnet [reason: trixie reimaging]
* 20:21 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6004.drmrs.wmnet [reason: trixie reimaging]
* 20:21 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6005.drmrs.wmnet with OS trixie
* 20:20 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6004.drmrs.wmnet with OS trixie
* 20:20 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6005.drmrs.wmnet [reason: trixie reimaging]
* 20:19 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6003.drmrs.wmnet [reason: trixie reimaging]
* 20:19 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudcephmon2007-dev.codfw.wmnet with reason: host reimage
* 20:19 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp70[09-12].magru.wmnet<nowiki>}</nowiki> and A:cp
* 20:18 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp700[1-4].magru.wmnet<nowiki>}</nowiki> and A:cp
* 20:17 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248665{{!}}Enable passwordless login in production (T419198)]], [[gerrit:1253572{{!}}Instrument clicks on external links to selected domains (T419837)]] (duration: 06m 43s)
* 20:16 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:15 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
* 20:15 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudcephmon2007-dev.codfw.wmnet with reason: host reimage
* 20:15 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6003.drmrs.wmnet with OS trixie
* 20:13 catrope@deploy2002: kharlan, catrope: Continuing with sync
* 20:12 xcollazo@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-content-history-reconcile-enrich: apply
* 20:12 catrope@deploy2002: kharlan, catrope: Backport for [[gerrit:1248665{{!}}Enable passwordless login in production (T419198)]], [[gerrit:1253572{{!}}Instrument clicks on external links to selected domains (T419837)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:12 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2042.codfw.wmnet with reason: host reimage
* 20:11 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
* 20:10 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1248665{{!}}Enable passwordless login in production (T419198)]], [[gerrit:1253572{{!}}Instrument clicks on external links to selected domains (T419837)]]
* 20:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6014.drmrs.wmnet with OS trixie
* 20:03 brett@cumin2002: START - Cookbook sre.hosts.decommission for hosts cp[2027-2040].codfw.wmnet
* 20:01 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251589{{!}}Uninstall GlobalBlocking from closed wikis (T420062)]] (duration: 08m 20s)
* 19:57 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 19:55 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6004.drmrs.wmnet with reason: host reimage
* 19:55 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephmon2007-dev.codfw.wmnet with OS bookworm
* 19:54 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1251589{{!}}Uninstall GlobalBlocking from closed wikis (T420062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:54 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp2042.codfw.wmnet with OS trixie
* 19:53 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host cloudcephmon2007-dev.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:53 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS trixie
* 19:52 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1251589{{!}}Uninstall GlobalBlocking from closed wikis (T420062)]]
* 19:52 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host cloudcephmon2007-dev.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 19:51 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251582{{!}}Uninstall AbuseFilter from closed wikis with no AbuseFilter logs (T420063)]] (duration: 09m 26s)
* 19:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6003.drmrs.wmnet with reason: host reimage
* 19:47 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 19:47 mutante: releases2003 - rm rsync-srv-org-wikimedia-releases-releases2003.* - alerts flapping since server reboot - puppet code needs to be improved to ensure units are removed when primary server is switched ([[phab:T420246|T420246]])
* 19:47 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6004.drmrs.wmnet with reason: host reimage
* 19:46 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6003.drmrs.wmnet with reason: host reimage
* 19:44 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1251582{{!}}Uninstall AbuseFilter from closed wikis with no AbuseFilter logs (T420063)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:43 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6014.drmrs.wmnet with reason: host reimage
* 19:42 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1251582{{!}}Uninstall AbuseFilter from closed wikis with no AbuseFilter logs (T420063)]]
* 19:41 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host cloudcephmon2007-dev
* 19:41 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host cloudcephmon2007-dev
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 19:40 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating cloudcephmon2007-dev in codfw - jhancock@cumin2002"
* 19:40 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6014.drmrs.wmnet with reason: host reimage
* 19:39 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253622{{!}}Revert "Media: Use previous step for non-standard width between steps and original" (T419927)]] (duration: 07m 10s)
* 19:35 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 19:34 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1253622{{!}}Revert "Media: Use previous step for non-standard width between steps and original" (T419927)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:32 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: updating cloudcephmon2007-dev in codfw - jhancock@cumin2002"
* 19:32 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1253622{{!}}Revert "Media: Use previous step for non-standard width between steps and original" (T419927)]]
* 19:28 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp404[5-6].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 19:28 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4046.ulsfo.wmnet
* 19:27 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6004.drmrs.wmnet with OS trixie
* 19:27 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 19:27 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6004.drmrs.wmnet [reason: trixie reimaging]
* 19:27 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6003.drmrs.wmnet with OS trixie
* 19:26 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6003.drmrs.wmnet [reason: trixie reimaging]
* 19:25 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6002.drmrs.wmnet [reason: trixie reimaging]
* 19:25 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp6001.drmrs.wmnet [reason: trixie reimaging]
* 19:21 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6014.drmrs.wmnet with OS trixie
* 19:17 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6002.drmrs.wmnet with OS trixie
* 19:17 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on cp2042.codfw.wmnet with reason: Testing hosts - not for production
* 19:16 fabfur@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 30 days, 0:00:00 on cp2041.codfw.wmnet with reason: Testing hosts - not for production
* 19:15 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:15 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 19:12 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6001.drmrs.wmnet with OS trixie
* 19:02 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 19:02 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 18:57 cdobbins@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp4046.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 18:57 cdobbins@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4046.ulsfo.wmnet
* 18:52 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6002.drmrs.wmnet with reason: host reimage
* 18:49 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4045.ulsfo.wmnet
* 18:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6001.drmrs.wmnet with reason: host reimage
* 18:47 cdobbins@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp4046.ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 18:47 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6002.drmrs.wmnet with reason: host reimage
* 18:45 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6001.drmrs.wmnet with reason: host reimage
* 18:39 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp404[5-6].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 18:38 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6015.*
* 18:38 brett@cumin2002: END (PASS) - Cookbook sre.cdn.roll-reboot (exit_code=0) rolling reboot on P<nowiki>{</nowiki>cp403[7-9].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 18:38 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4039.ulsfo.wmnet
* 18:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6015.drmrs.wmnet with OS trixie
* 18:27 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6002.drmrs.wmnet with OS trixie
* 18:26 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6002.drmrs.wmnet [reason: trixie reimaging]
* 18:26 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp6001.drmrs.wmnet with OS trixie
* 18:24 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp6001.drmrs.wmnet [reason: trixie reimaging]
* 18:03 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6015.drmrs.wmnet with reason: host reimage
* 17:59 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6015.drmrs.wmnet with reason: host reimage
* 17:58 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4038.ulsfo.wmnet
* 17:39 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6015.drmrs.wmnet with OS trixie
* 17:37 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp6016.*
* 17:32 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2042.codfw.wmnet with OS trixie
* 17:18 brett@cumin2002: cookbooks.sre.cdn.roll-reboot finished rebooting cp4037.ulsfo.wmnet
* 17:08 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2042.codfw.wmnet with reason: host reimage
* 17:06 brett@cumin2002: START - Cookbook sre.cdn.roll-reboot rolling reboot on P<nowiki>{</nowiki>cp403[7-9].ulsfo.wmnet<nowiki>}</nowiki> and A:cp
* 17:03 fabfur@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2042.codfw.wmnet with reason: host reimage
* 17:02 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp6016.drmrs.wmnet with OS trixie
* 16:57 mutante: contint2002 - rebooting
* 16:47 mutante: phab2002 - rebooting
* 16:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:44 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253520{{!}}Add APCOND_OATH_HAS2FA to UserRequirementsPrivateConditions]] (duration: 06m 15s)
* 16:42 mutante: rebooting backends of releases.wikimedia.org
* 16:42 fabfur@cumin1003: START - Cookbook sre.hosts.reimage for host cp2042.codfw.wmnet with OS trixie
* 16:41 fabfur: reimage cp2042 for HAProxy testing ([[phab:T419825|T419825]])
* 16:41 mszwarc@deploy2002: mszwarc: Continuing with sync
* 16:40 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1253520{{!}}Add APCOND_OATH_HAS2FA to UserRequirementsPrivateConditions]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:39 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2041.codfw.wmnet with OS trixie
* 16:38 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1253520{{!}}Add APCOND_OATH_HAS2FA to UserRequirementsPrivateConditions]]
* 16:37 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1020-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 16:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp6016.drmrs.wmnet with reason: host reimage
* 16:32 milimetric: my bad, accidentally merged https://gerrit.wikimedia.org/r/c/operations/mediawiki-config/+/1250249; will read the docs on config deployment more carefully
* 16:31 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1012.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:29 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1012
* 16:27 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1012
* 16:27 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp6016.drmrs.wmnet with reason: host reimage
* 16:20 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]] (duration: 07m 28s)
* 16:17 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode2001.codfw.wmnet
* 16:16 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 16:14 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:13 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2005.codfw.wmnet
* 16:12 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]]
* 16:12 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode2001.codfw.wmnet
* 16:11 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dragonfly-supernode1001.eqiad.wmnet
* 16:11 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts,name=codfw
* 16:11 fabfur@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
* 16:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum2001.codfw.wmnet
* 16:09 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1024.eqiad.wmnet
* 16:09 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1024.eqiad.wmnet
* 16:09 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1024.eqiad.wmnet
* 16:07 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1004-1007,1011-1012,1015-1016,1019-1021,1029-1031,1034-1168,1240-1289,1291-1327].eqiad.wmnet
* 16:06 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1004-1007,1011-1012,1015-1016,1019-1021,1029-1031,1034-1168,1240-1289,1291-1327].eqiad.wmnet
* 16:06 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp6016.drmrs.wmnet with OS trixie
* 16:06 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2005.codfw.wmnet
* 16:06 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host dragonfly-supernode1001.eqiad.wmnet
* 16:05 dwisehaupt@dns1006: END - running authdns-update
* 16:05 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum2001.codfw.wmnet
* 16:05 fabfur@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2041.codfw.wmnet with reason: host reimage
* 16:04 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts,name=codfw
* 16:04 dwisehaupt@dns1006: START - running authdns-update
* 16:04 jayme@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=helm-charts,name=eqiad
* 16:00 blake@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1004-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 15:59 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host chartmuseum1001.eqiad.wmnet
* 15:59 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2031.codfw.wmnet
* 15:59 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti2031.codfw.wmnet
* 15:54 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2004.codfw.wmnet
* 15:53 jayme@cumin1003: START - Cookbook sre.hosts.reboot-single for host chartmuseum1001.eqiad.wmnet
* 15:52 jayme@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=helm-charts,name=eqiad
* 15:52 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncmonitor1001.eqiad.wmnet
* 15:47 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2004.codfw.wmnet
* 15:47 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2031.codfw.wmnet
* 15:47 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncmonitor1001.eqiad.wmnet
* 15:46 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1024.eqiad.wmnet with reason: Rebooting clouddb1024 [[phab:T419960|T419960]]
* 15:44 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1024.eqiad.wmnet
* 15:43 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2031.codfw.wmnet
* 15:43 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1023.eqiad.wmnet
* 15:43 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1023.eqiad.wmnet
* 15:43 fabfur@cumin1003: START - Cookbook sre.hosts.reimage for host cp2041.codfw.wmnet with OS trixie
* 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2030.codfw.wmnet
* 15:43 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2030.codfw.wmnet
* 15:42 fabfur: reimage cp2041 for HAProxy testing ([[phab:T419825|T419825]])
* 15:42 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:41 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:41 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2003.codfw.wmnet
* 15:39 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:38 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:38 jmm@cumin2002: END (PASS) - Cookbook sre.ldap.roll-restart-reboot-replica (exit_code=0) rolling reboot on A:ldap-replicas-codfw
* 15:38 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:38 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:37 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Rebooting clouddb1023 [[phab:T419960|T419960]]
* 15:35 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1022.eqiad.wmnet
* 15:35 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1022.eqiad.wmnet
* 15:35 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2030.codfw.wmnet
* 15:32 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2003.codfw.wmnet
* 15:32 dwisehaupt@dns1006: END - running authdns-update
* 15:32 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2002.codfw.wmnet
* 15:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2030.codfw.wmnet
* 15:31 dwisehaupt@dns1006: START - running authdns-update
* 15:27 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling reboot on A:swift-fe-codfw
* 15:26 jmm@cumin2002: START - Cookbook sre.ldap.roll-restart-reboot-replica rolling reboot on A:ldap-replicas-codfw
* 15:26 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti2029.codfw.wmnet
* 15:26 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti2029.codfw.wmnet
* 15:26 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:26 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:24 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:24 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 15:22 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2002.codfw.wmnet
* 15:21 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1023.eqiad.wmnet with reason: Rebooting clouddb1023 [[phab:T419960|T419960]]
* 15:20 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl2001.codfw.wmnet
* 15:20 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]]
* 15:16 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1022.eqiad.wmnet with reason: Rebooting clouddb1022 [[phab:T419960|T419960]]
* 15:13 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2029.codfw.wmnet
* 15:11 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-durum (exit_code=0) rolling reboot on A:durum and A:durum
* 15:09 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2029.codfw.wmnet
* 15:04 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl2001.codfw.wmnet
* 15:03 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:03 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:02 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling reboot on A:wikidough
* 15:01 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudgw1004.eqiad.wmnet
* 15:01 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 15:00 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2028.codfw.wmnet
* 15:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2028.codfw.wmnet
* 14:56 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:55 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:55 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:54 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1253518{{!}}Revert "mediawiki.util: Prefer prev step over non-standard in adjustThumbWidthForSteps" (T419927)]]
* 14:53 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudgw1004.eqiad.wmnet
* 14:53 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-codfw
* 14:52 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2028.codfw.wmnet
* 14:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2028.codfw.wmnet
* 14:51 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:50 mvernon@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling reboot on A:swift-fe-eqiad
* 14:34 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw1003.eqiad.wmnet with OS trixie
* 14:31 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:31 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:30 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:28 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:28 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:26 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:22 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1002-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:22 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1002-1003].eqiad.wmnet
* 14:22 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1002-1003].eqiad.wmnet
* 14:21 blake@cumin1003: END (ERROR) - Cookbook sre.k8s.reboot-nodes (exit_code=97) rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1002-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:21 blake@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on P<nowiki>{</nowiki>wikikube-worker[1002-1327].eqiad.wmnet<nowiki>}</nowiki> and (A:wikikube-master-eqiad or A:wikikube-worker-eqiad)
* 14:20 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:20 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:20 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:18 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253461{{!}}fix(anon warning): remove wring type=signup param (T415160)]], [[gerrit:1253450{{!}}AccountCreation: track account registrations for WE1.8 experiments (T416100)]] (duration: 09m 16s)
* 14:17 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:17 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:14 sgimeno@deploy2002: sgimeno: Continuing with sync
* 14:13 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:13 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:11 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:11 sgimeno@deploy2002: sgimeno: Backport for [[gerrit:1253461{{!}}fix(anon warning): remove wring type=signup param (T415160)]], [[gerrit:1253450{{!}}AccountCreation: track account registrations for WE1.8 experiments (T416100)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:11 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 14:10 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw1003.eqiad.wmnet with reason: host reimage
* 14:09 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1253461{{!}}fix(anon warning): remove wring type=signup param (T415160)]], [[gerrit:1253450{{!}}AccountCreation: track account registrations for WE1.8 experiments (T416100)]]
* 14:08 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2026.codfw.wmnet
* 14:08 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2026.codfw.wmnet
* 14:08 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:04 arnaudb@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on gerrit2002.wikimedia.org with reason: testing
* 14:03 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw1003.eqiad.wmnet with reason: host reimage
* 14:02 arnaudb@cumin1003: DONE (ERROR) - Cookbook sre.hosts.downtime (exit_code=97) for 2:00:00 on gerrit2002.wikimedia.org with reason: [[phab:T418256|T418256]]
* 14:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1003.eqiad.wmnet
* 13:58 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 13:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2026.codfw.wmnet
* 13:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1003.eqiad.wmnet
* 13:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow1002.eqiad.wmnet
* 13:45 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253046{{!}}bowiki: update logos (T419268)]] (duration: 06m 17s)
* 13:45 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudgw1003.eqiad.wmnet with OS trixie
* 13:43 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-codfw
* 13:43 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad
* 13:41 mszwarc@deploy2002: mszwarc, anzx: Continuing with sync
* 13:41 mszwarc@deploy2002: mszwarc, anzx: Backport for [[gerrit:1253046{{!}}bowiki: update logos (T419268)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow1002.eqiad.wmnet
* 13:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2004.codfw.wmnet
* 13:39 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1253046{{!}}bowiki: update logos (T419268)]]
* 13:38 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1253423{{!}}Always use external actor for interwiki rights logs on target wiki (T6055)]] (duration: 08m 53s)
* 13:34 mszwarc@deploy2002: mszwarc: Continuing with sync
* 13:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2004.codfw.wmnet
* 13:32 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2026.codfw.wmnet
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti2025.codfw.wmnet
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti2025.codfw.wmnet
* 13:31 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow2003.codfw.wmnet
* 13:31 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1253423{{!}}Always use external actor for interwiki rights logs on target wiki (T6055)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:30 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 13:29 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1253423{{!}}Always use external actor for interwiki rights logs on target wiki (T6055)]]
* 13:28 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad
* 13:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow2003.codfw.wmnet
* 13:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow3004.esams.wmnet
* 13:25 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling reboot on A:wikidough
* 13:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti2025.codfw.wmnet
* 13:22 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-durum rolling reboot on A:durum and A:durum
* 13:21 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti2025.codfw.wmnet
* 13:21 XioNoX: drain edgeuno transit for optic replacement - [[phab:T415743|T415743]]
* 13:19 cgoubert@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host wikikube-ctrl1004.eqiad.wmnet
* 13:19 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow3004.esams.wmnet
* 13:14 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251487{{!}}Replace direct BagOStuff with WANObjectCache (T419666)]] (duration: 11m 25s)
* 13:11 bwojtowicz@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:11 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow4003.ulsfo.wmnet
* 13:09 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti3005.esams.wmnet
* 13:09 jmm@cumin2002: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host ganeti3005.esams.wmnet
* 13:07 jforrester@deploy2002: jforrester: Continuing with sync
* 13:06 jforrester@deploy2002: jforrester: Backport for [[gerrit:1251487{{!}}Replace direct BagOStuff with WANObjectCache (T419666)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:05 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl1004.eqiad.wmnet
* 13:04 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 13:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ncredir4002.ulsfo.wmnet
* 13:04 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:04 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ncredir4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 13:03 jiji@cumin1003: END (ERROR) - Cookbook sre.memcached.roll-reboot-restart (exit_code=97) rolling reboot on A:memcached-gutter-eqiad
* 13:03 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ncredir4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 13:03 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1251487{{!}}Replace direct BagOStuff with WANObjectCache (T419666)]]
* 13:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow4003.ulsfo.wmnet
* 12:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow5002.eqsin.wmnet
* 12:51 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl1003.eqiad.wmnet
* 12:50 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow5002.eqsin.wmnet
* 12:48 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 12:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3005.esams.wmnet
* 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow6001.drmrs.wmnet
* 12:44 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:43 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3005.esams.wmnet
* 12:42 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl1003.eqiad.wmnet
* 12:41 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host wikikube-ctrl1002.eqiad.wmnet
* 12:40 aikochou@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 12:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow6001.drmrs.wmnet
* 12:37 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ncredir4002.ulsfo.wmnet
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts ncredir4001.ulsfo.wmnet
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:37 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ncredir4001.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 12:35 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:34 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow7002.magru.wmnet
* 12:32 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:32 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:32 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:28 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1017
* 12:27 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: ncredir4001.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 12:27 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow7002.magru.wmnet
* 12:27 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1017
* 12:25 aikochou@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 12:25 cgoubert@cumin1003: START - Cookbook sre.hosts.reboot-single for host wikikube-ctrl1002.eqiad.wmnet
* 12:22 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:20 moritzm: failover Ganeti master in esams to ganeti3008
* 12:20 moritzm: failover Ganeti master in esams to ganeti3005
* 12:15 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:14 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:10 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts ncredir4001.ulsfo.wmnet
* 12:00 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti3006.esams.wmnet
* 12:00 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host ganeti3006.esams.wmnet
* 11:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for druid[1009-1013].eqiad.wmnet
* 11:57 btullis@cumin1003: START - Cookbook sre.hosts.remove-downtime for druid[1009-1013].eqiad.wmnet
* 11:57 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.remove-downtime (exit_code=97) for druid[1009-1013].eqiad.wmnet
* 11:57 btullis@cumin1003: START - Cookbook sre.hosts.remove-downtime for druid[1009-1013].eqiad.wmnet
* 11:46 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1009.eqiad.wmnet with OS bookworm
* 11:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3006.esams.wmnet
* 11:43 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1010.eqiad.wmnet with OS bookworm
* 11:37 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1011.eqiad.wmnet with OS bookworm
* 11:31 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1012.eqiad.wmnet with OS bookworm
* 11:30 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3006.esams.wmnet
* 11:29 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host druid1013.eqiad.wmnet with OS bookworm
* 11:24 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:24 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:22 btullis@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6 days, 0:00:00 on dse-k8s-worker[1012,1015-1017].eqiad.wmnet with reason: Adding 10 Gbps NIC
* 11:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1009.eqiad.wmnet with reason: host reimage
* 11:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1010.eqiad.wmnet with reason: host reimage
* 11:15 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:14 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:12 mvernon@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling reboot on A:swift-fe-eqiad
* 11:12 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling reboot on A:swift-fe-codfw
* 11:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1011.eqiad.wmnet with reason: host reimage
* 11:10 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:10 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1012.eqiad.wmnet with reason: host reimage
* 11:07 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gerrit2003.wikimedia.org
* 11:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on druid1013.eqiad.wmnet with reason: host reimage
* 11:06 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 11:06 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-sre: apply
* 11:04 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1010.eqiad.wmnet with reason: host reimage
* 11:02 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1011.eqiad.wmnet with reason: host reimage
* 11:02 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1009.eqiad.wmnet with reason: host reimage
* 11:01 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1012.eqiad.wmnet with reason: host reimage
* 11:00 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gerrit2003.wikimedia.org
* 10:57 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on druid1013.eqiad.wmnet with reason: host reimage
* 10:47 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2010.codfw.wmnet
* 10:47 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1013.eqiad.wmnet with OS bookworm
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1012.eqiad.wmnet with OS bookworm
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1011.eqiad.wmnet with OS bookworm
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1010.eqiad.wmnet with OS bookworm
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host druid1009.eqiad.wmnet with OS bookworm
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3007.esams.wmnet
* 10:39 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2010.codfw.wmnet
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3007.esams.wmnet
* 10:31 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3007.esams.wmnet
* 10:29 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3007.esams.wmnet
* 10:28 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:28 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 10:25 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti3008.esams.wmnet
* 10:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti3008.esams.wmnet
* 10:25 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2009.codfw.wmnet
* 10:24 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:24 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:23 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:20 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 10:20 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2009.codfw.wmnet
* 10:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti3008.esams.wmnet
* 10:14 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti3008.esams.wmnet
* 10:09 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:08 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 10:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host sretest2004.codfw.wmnet
* 10:02 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7001.magru.wmnet
* 10:02 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7001.magru.wmnet
* 10:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host sretest2004.codfw.wmnet
* 09:56 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:56 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7001.magru.wmnet
* 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts tcp-proxy4002.ulsfo.wmnet
* 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: tcp-proxy4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 09:54 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 09:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: tcp-proxy4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 09:51 elukey@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 09:51 elukey@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 09:51 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7001.magru.wmnet
* 09:51 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:46 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts tcp-proxy4002.ulsfo.wmnet
* 09:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: decom tcp-proxy4001 - jmm@cumin2002"
* 09:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: decom tcp-proxy4001 - jmm@cumin2002"
* 09:43 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm2001.wikimedia.org
* 09:39 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idm2001.wikimedia.org
* 09:38 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:38 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 09:38 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 09:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:35 slyngshede@dns1004: END - running authdns-update
* 09:34 elukey@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 09:34 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 09:34 elukey@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 09:33 slyngshede@dns1004: START - running authdns-update
* 09:32 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:30 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm1001.wikimedia.org
* 09:26 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 09:26 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idm1001.wikimedia.org
* 09:24 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idm-test1001.wikimedia.org
* 09:22 moritzm: failover Ganeti master in magru to ganeti7004
* 09:21 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts tcp-proxy4001.ulsfo.wmnet
* 09:21 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 09:20 jmm@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling reboot on A:kafka-test-eqiad
* 09:20 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idm-test1001.wikimedia.org
* 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host testvm2002.codfw.wmnet
* 09:18 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:15 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM cloudidp2001-dev.codfw.wmnet
* 09:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host testvm2002.codfw.wmnet
* 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7002.magru.wmnet
* 09:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7002.magru.wmnet
* 09:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host build2001.codfw.wmnet
* 09:13 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts tcp-proxy4001.ulsfo.wmnet
* 09:11 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM cloudidp2001-dev.codfw.wmnet
* 09:09 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idp2005.wikimedia.org
* 09:08 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host build2001.codfw.wmnet
* 09:06 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7002.magru.wmnet
* 09:05 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idp2005.wikimedia.org
* 09:03 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7002.magru.wmnet
* 09:00 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pki-root1002.eqiad.wmnet
* 08:59 slyngshede@dns1004: END - running authdns-update
* 08:58 slyngshede@dns1004: START - running authdns-update
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7003.magru.wmnet
* 08:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7003.magru.wmnet
* 08:54 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host pki-root1002.eqiad.wmnet
* 08:49 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:48 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idp1005.wikimedia.org
* 08:48 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:48 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 08:48 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7003.magru.wmnet
* 08:48 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 08:47 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 08:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7003.magru.wmnet
* 08:44 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idp1005.wikimedia.org
* 08:44 jmm@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling reboot on A:kafka-test-eqiad
* 08:44 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 08:39 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idp-test1005.wikimedia.org
* 08:35 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idp-test1005.wikimedia.org
* 08:33 slyngshede@cumin1003: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM idp-test2005.wikimedia.org
* 08:29 slyngshede@cumin1003: START - Cookbook sre.ganeti.reboot-vm for VM idp-test2005.wikimedia.org
* 08:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti7004.magru.wmnet
* 08:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti7004.magru.wmnet
* 08:22 taavi@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet,service=s3
* 08:18 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251276{{!}}Fix broken survey links on PersonalDashboard (T419950)]] (duration: 32m 09s)
* 08:17 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti7004.magru.wmnet
* 08:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti7004.magru.wmnet
* 08:06 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1004.wikimedia.org
* 08:05 kgraessle@deploy2002: kgraessle: Continuing with sync
* 08:04 kgraessle@deploy2002: kgraessle: Backport for [[gerrit:1251276{{!}}Fix broken survey links on PersonalDashboard (T419950)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:59 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1004.wikimedia.org
* 07:52 moritzm: installing Linux 5.10.251 on Bullseye hosts
* 07:45 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1251276{{!}}Fix broken survey links on PersonalDashboard (T419950)]]
* 07:37 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host stewards1001.eqiad.wmnet
* 07:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host stewards1001.eqiad.wmnet
* 07:33 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1003.eqiad.wmnet
* 07:26 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1003.eqiad.wmnet
* 07:25 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aphlict1002.eqiad.wmnet
* 07:21 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host aphlict1002.eqiad.wmnet
* 07:10 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host doc2003.codfw.wmnet
* 07:06 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host doc2003.codfw.wmnet
* 07:02 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists1004.wikimedia.org
* 06:55 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists1004.wikimedia.org
* 05:25 aikochou@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 52s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-15 ==
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 52s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-14 ==
* 14:16 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251941{{!}}CommonSettings: Set class in $wgCentralAuthRC]] (duration: 06m 17s)
* 14:12 reedy@deploy2002: reedy: Continuing with sync
* 14:11 reedy@deploy2002: reedy: Backport for [[gerrit:1251941{{!}}CommonSettings: Set class in $wgCentralAuthRC]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:10 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1251941{{!}}CommonSettings: Set class in $wgCentralAuthRC]]
* 12:51 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251912{{!}}CommonSettings: Specify class in IRC RCFeed setup]] (duration: 06m 19s)
* 12:47 reedy@deploy2002: reedy, lcawte: Continuing with sync
* 12:46 reedy@deploy2002: reedy, lcawte: Backport for [[gerrit:1251912{{!}}CommonSettings: Specify class in IRC RCFeed setup]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:44 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1251912{{!}}CommonSettings: Specify class in IRC RCFeed setup]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 00s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-13 ==
* 22:52 taavi: taavi@deploy2002 ~ $ mwscript CentralAuth:attachAccount.php --wiki=metawiki --userlist backfiller.txt # unify unified Special:CentralAuth/MediaWikiAccountBackfiller on meta
* 20:07 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2004-dev.codfw.wmnet
* 20:01 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudservices2004-dev.codfw.wmnet
* 20:01 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2004-dev.codfw.wmnet
* 19:55 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4052.*
* 19:54 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudservices2004-dev.codfw.wmnet
* 19:54 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4052.ulsfo.wmnet with OS trixie
* 19:53 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2005-dev.codfw.wmnet
* 19:46 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudservices2005-dev.codfw.wmnet
* 19:46 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudservices2004-dev.codfw.wmnet
* 19:40 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4050.*
* 19:40 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudservices2004-dev.codfw.wmnet
* 19:29 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 19:24 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4052.ulsfo.wmnet with reason: host reimage
* 19:23 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4050.ulsfo.wmnet
* 19:19 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1035.eqiad.wmnet with OS trixie
* 19:19 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs1034.eqiad.wmnet with OS trixie
* 19:18 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:18 jclark@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:18 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:16 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4051.*
* 19:15 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp4050.ulsfo.wmnet
* 19:14 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp4050.ulsfo.wmnet
* 19:13 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:11 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4051.ulsfo.wmnet with OS trixie
* 19:07 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp4050.ulsfo.wmnet
* 19:02 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1035.eqiad.wmnet with reason: host reimage
* 19:00 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1374.eqiad.wmnet with OS bookworm
* 19:00 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 19:00 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 18:58 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs1034.eqiad.wmnet with reason: host reimage
* 18:58 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS trixie
* 18:57 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4052.ulsfo.wmnet with OS trixie
* 18:55 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1035.eqiad.wmnet with reason: host reimage
* 18:55 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs1034.eqiad.wmnet with reason: host reimage
* 18:47 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4051.ulsfo.wmnet with reason: host reimage
* 18:43 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 18:41 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4051.ulsfo.wmnet with reason: host reimage
* 18:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1035.eqiad.wmnet with OS trixie
* 18:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1034.eqiad.wmnet with OS trixie
* 18:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wdqs1033.eqiad.wmnet with OS trixie
* 18:36 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1374.eqiad.wmnet with reason: host reimage
* 18:35 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on cp4050.ulsfo.wmnet with reason: firmware updates
* 18:34 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4052.ulsfo.wmnet with OS trixie
* 18:24 brett@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp4050.ulsfo.wmnet
* 18:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4050.ulsfo.wmnet with OS trixie
* 18:22 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1374.eqiad.wmnet with OS bookworm
* 18:21 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1374.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 18:21 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4051.ulsfo.wmnet with OS trixie
* 18:21 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4051.ulsfo.wmnet with OS trixie
* 18:12 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1373.eqiad.wmnet with OS bookworm
* 18:10 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1374.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 18:10 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:10 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 18:10 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt - jclark@cumin1003"
* 18:10 elukey@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on db1253.eqiad.wmnet with reason: Host went down and paged, depooled
* 18:06 cgoubert@cumin1003: dbctl commit (dc=all): 'Depool db1253', diff saved to https://phabricator.wikimedia.org/P89856 and previous config saved to /var/cache/conftool/dbconfig/20260313-180640-cgoubert.json
* 18:06 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 18:05 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4051.ulsfo.wmnet with OS trixie
* 18:03 elukey: powercycle db1253 - host not reachable via ssh, no events logged in racadm getsel, no console com2 available (blank screen)
* 17:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 17:56 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4050.ulsfo.wmnet with reason: host reimage
* 17:49 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4049.*
* 17:46 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4049.ulsfo.wmnet with OS trixie
* 17:37 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:37 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:36 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie
* 17:35 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4050.ulsfo.wmnet with OS trixie
* 17:35 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:34 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:27 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 17:26 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 17:26 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 17:26 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 17:20 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4049.ulsfo.wmnet with reason: host reimage
* 17:17 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4050.ulsfo.wmnet with OS trixie
* 17:17 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:16 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:16 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4049.ulsfo.wmnet with reason: host reimage
* 17:12 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1016.eqiad.wmnet
* 17:12 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1016.eqiad.wmnet
* 17:11 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1016.eqiad.wmnet
* 17:11 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4048.*
* 17:10 dhinus: (relogging failed sal) conftool action : set/pooled=no; selector: name=clouddb1016.eqiad.wmnet
* 17:10 dhinus: (relogging failed sal) DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1016.eqiad.wmnet with reason: Rebooting clouddb1016 [[phab:T419960|T419960]]
* 17:09 dhinus: (relogging failed sal) END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1015.eqiad.wmnet
* 17:08 dhinus: (relogging failed sal) START - Cookbook sre.hosts.remove-downtime for clouddb1015.eqiad.wmnet
* 17:08 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 17:07 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 17:07 dhinus: fnegri@cumin1003 conftool action : set/pooled=yes; selector: name=clouddb1015.eqiad.wmnet
* 17:07 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 17:07 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4048.ulsfo.wmnet with OS trixie
* 17:06 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 16:40 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4049.ulsfo.wmnet with OS trixie
* 16:39 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4048.ulsfo.wmnet with reason: host reimage
* 16:36 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1015.eqiad.wmnet
* 16:35 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1015.eqiad.wmnet with reason: Rebooting clouddb1015 [[phab:T419960|T419960]]
* 16:34 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1014.eqiad.wmnet
* 16:34 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1014.eqiad.wmnet
* 16:34 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1014.eqiad.wmnet
* 16:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host apt-staging2001.codfw.wmnet
* 16:28 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudweb1003.wikimedia.org
* 16:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host apt-staging2001.codfw.wmnet
* 16:22 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudweb1003.wikimedia.org
* 16:21 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudweb1004.wikimedia.org
* 16:20 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1014.eqiad.wmnet with reason: Rebooting clouddb1014 [[phab:T419960|T419960]]
* 16:20 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1014.eqiad.wmnet
* 16:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor-dev2001.codfw.wmnet
* 16:19 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4048.ulsfo.wmnet with OS trixie
* 16:18 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4048.ulsfo.wmnet with OS trixie
* 16:16 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudweb1004.wikimedia.org
* 16:15 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor-dev2001.codfw.wmnet
* 16:14 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host debmonitor2003.codfw.wmnet
* 16:10 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host debmonitor2003.codfw.wmnet
* 16:00 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:00 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 16:00 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4048.ulsfo.wmnet with OS trixie
* 15:43 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2010-dev.codfw.wmnet
* 15:38 vgutierrez@cumin1003: END (PASS) - Cookbook sre.loadbalancer.check-ipip (exit_code=0)
* 15:38 vgutierrez@cumin1003: START - Cookbook sre.loadbalancer.check-ipip
* 15:37 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 15:37 vgutierrez@cumin1003: END (FAIL) - Cookbook sre.loadbalancer.check-ipip (exit_code=99)
* 15:37 vgutierrez@cumin1003: START - Cookbook sre.loadbalancer.check-ipip
* 15:36 vgutierrez@cumin1003: END (FAIL) - Cookbook sre.loadbalancer.check-ipip (exit_code=99)
* 15:36 vgutierrez@cumin1003: START - Cookbook sre.loadbalancer.check-ipip
* 15:36 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2010-dev.codfw.wmnet
* 15:35 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2005-dev.codfw.wmnet
* 15:35 vgutierrez@cumin1003: END (FAIL) - Cookbook sre.loadbalancer.check-ipip (exit_code=99)
* 15:35 vgutierrez@cumin1003: START - Cookbook sre.loadbalancer.check-ipip
* 15:28 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2005-dev.codfw.wmnet
* 15:26 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:25 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:23 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:22 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 15:22 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 15:22 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 15:19 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:19 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:19 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:16 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2006-dev.codfw.wmnet
* 15:12 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudidp2001-dev.codfw.wmnet
* 15:08 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudidp2001-dev.codfw.wmnet
* 15:07 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2006-dev.codfw.wmnet
* 14:58 fnegri@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for clouddb1013.eqiad.wmnet
* 14:58 fnegri@cumin1003: START - Cookbook sre.hosts.remove-downtime for clouddb1013.eqiad.wmnet
* 14:57 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet,service=s3
* 14:57 fnegri@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1013.eqiad.wmnet,service=s1
* 14:48 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host backup1015.eqiad.wmnet
* 14:46 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1035.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:46 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol2005-dev.codfw.wmnet
* 14:45 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti-test2003.codfw.wmnet
* 14:45 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2003.codfw.wmnet
* 14:44 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wikikube-worker1373.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:43 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1015.eqiad.wmnet
* 14:43 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1034.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:42 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host wdqs1033.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:40 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1023
* 14:40 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1023
* 14:40 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1022
* 14:40 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1022
* 14:40 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1021
* 14:39 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup2004.codfw.wmnet
* 14:39 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1021
* 14:38 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudcontrol2005-dev.codfw.wmnet
* 14:37 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1020
* 14:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2003.codfw.wmnet
* 14:36 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti-test2003.codfw.wmnet
* 14:35 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1020
* 14:35 fnegri@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on clouddb1013.eqiad.wmnet with reason: Rebooting clouddb1013 [[phab:T419960|T419960]]
* 14:33 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:33 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1035.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:33 fnegri@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1013.eqiad.wmnet
* 14:32 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host wdqs1035.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:32 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1035.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:32 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti-test2003.codfw.wmnet
* 14:32 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1034.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:31 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti-test2003.codfw.wmnet
* 14:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wdqs1033.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host wikikube-worker1373.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1020.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:29 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 14:29 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt - jclark@cumin1003"
* 14:29 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update mgmt - jclark@cumin1003"
* 14:28 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti-test2002.codfw.wmnet
* 14:28 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2002.codfw.wmnet
* 14:27 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudbackup2004.codfw.wmnet
* 14:27 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cuminunpriv1001.eqiad.wmnet
* 14:25 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup2003.codfw.wmnet
* 14:25 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 14:25 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 14:24 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 14:23 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cuminunpriv1001.eqiad.wmnet
* 14:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2002.codfw.wmnet
* 14:22 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1004.eqiad.wmnet
* 14:22 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti-test2002.codfw.wmnet
* 14:14 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudbackup2003.codfw.wmnet
* 14:13 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudbackup1004.eqiad.wmnet
* 14:09 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1003.eqiad.wmnet
* 14:01 andrew@cumin2002: START - Cookbook sre.hosts.reboot-single for host cloudbackup1003.eqiad.wmnet
* 13:59 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gerrit1003.wikimedia.org
* 13:53 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gerrit1003.wikimedia.org
* 13:49 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host lists2001.wikimedia.org
* 13:48 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts1004.eqiad.wmnet
* 13:46 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host etherpad1004.eqiad.wmnet
* 13:45 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:45 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:44 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts1004.eqiad.wmnet
* 13:42 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host lists2001.wikimedia.org
* 13:42 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host etherpad1004.eqiad.wmnet
* 13:37 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host etherpad2002.codfw.wmnet
* 13:36 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gerrit2002.wikimedia.org
* 13:33 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host etherpad2002.codfw.wmnet
* 13:32 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2003.wikimedia.org
* 13:30 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host gerrit2002.wikimedia.org
* 13:26 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab2003.wikimedia.org
* 13:26 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab2002.wikimedia.org
* 13:24 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2020.codfw.wmnet
* 13:23 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2019.codfw.wmnet
* 13:19 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab2002.wikimedia.org
* 13:19 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host gitlab1003.wikimedia.org
* 13:13 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2020.codfw.wmnet
* 13:13 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host gitlab1003.wikimedia.org
* 13:12 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2019.codfw.wmnet
* 13:11 jelto@cumin1003: END (PASS) - Cookbook sre.gitlab.reboot-runner (exit_code=0) rolling reboot on A:gitlab-runner
* 13:05 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2018.codfw.wmnet
* 13:05 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1020.eqiad.wmnet
* 12:54 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2018.codfw.wmnet
* 12:54 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1020.eqiad.wmnet
* 12:54 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2017.codfw.wmnet
* 12:54 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1019.eqiad.wmnet
* 12:53 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:50 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:50 moritzm: powercycle pki1002
* 12:48 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:47 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:44 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:44 mutante: rebooted phab1005 - waiting for it to come back
* 12:44 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:43 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2017.codfw.wmnet
* 12:43 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1019.eqiad.wmnet
* 12:42 bwojtowicz@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'experimental' for release 'main'.
* 12:40 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1018.eqiad.wmnet
* 12:39 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2016.codfw.wmnet
* 12:31 jelto@cumin1003: START - Cookbook sre.gitlab.reboot-runner rolling reboot on A:gitlab-runner
* 12:29 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1018.eqiad.wmnet
* 12:29 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1017.eqiad.wmnet
* 12:28 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2016.codfw.wmnet
* 12:27 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup2015.codfw.wmnet
* 12:24 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host bast1004.wikimedia.org
* 12:18 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host doc1004.eqiad.wmnet
* 12:18 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1017.eqiad.wmnet
* 12:17 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:17 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:15 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup2015.codfw.wmnet
* 12:15 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:15 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:14 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host doc1004.eqiad.wmnet
* 12:13 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host aphlict2001.codfw.wmnet
* 12:10 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host aphlict2001.codfw.wmnet
* 12:10 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 6 hosts with reason: reboot
* 12:10 aokoth@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host vrts2002.codfw.wmnet
* 12:07 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:07 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:03 aokoth@cumin1003: START - Cookbook sre.hosts.reboot-single for host vrts2002.codfw.wmnet
* 12:02 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:02 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:01 jynus@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host backup1016.eqiad.wmnet
* 12:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1019.eqiad.wmnet
* 11:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1018.eqiad.wmnet
* 11:59 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:59 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1019.eqiad.wmnet
* 11:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1018.eqiad.wmnet
* 11:51 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 11:51 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 11:50 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host backup1016.eqiad.wmnet
* 11:49 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-backup2004.codfw.wmnet
* 11:43 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-backup2004.codfw.wmnet
* 11:43 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-backup1004.eqiad.wmnet
* 11:37 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-backup1004.eqiad.wmnet
* 11:36 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-backup2003.codfw.wmnet
* 11:34 jynus@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-backup1003.eqiad.wmnet
* 11:32 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 11:32 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:30 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-backup2003.codfw.wmnet
* 11:28 jynus@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-backup1003.eqiad.wmnet
* 11:27 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:26 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy1001.eqiad.wmnet
* 11:21 arnaudb@cumin1003: END (ERROR) - Cookbook sre.hosts.reboot-single (exit_code=97) for host contint1003.wikimedia.org
* 11:21 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy1001.eqiad.wmnet
* 11:21 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy1002.eqiad.wmnet
* 11:16 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy1002.eqiad.wmnet
* 11:16 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy2001.codfw.wmnet
* 11:16 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host contint1003.wikimedia.org
* 11:12 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-staging-master-codfw
* 11:12 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul1001.eqiad.wmnet
* 11:11 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy2001.codfw.wmnet
* 11:11 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy2002.codfw.wmnet
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1018.eqiad.wmnet with reason: host reimage
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:09 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-staging-master-eqiad
* 11:08 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 11:08 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul1001.eqiad.wmnet
* 11:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudcontrol1008-dev.eqiad.wmnet
* 11:07 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy2002.codfw.wmnet
* 11:06 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy3001.esams.wmnet
* 11:05 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1018.eqiad.wmnet with reason: host reimage
* 11:01 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy3001.esams.wmnet
* 11:01 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudcontrol1008-dev.eqiad.wmnet
* 11:01 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1002-dev.eqiad.wmnet
* 11:01 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy3002.esams.wmnet
* 10:59 mvernon@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 22:00:00 on db1258.eqiad.wmnet with reason: depooled, likely to flap over the weekend
* 10:57 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudbackup1002-dev.eqiad.wmnet
* 10:57 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudbackup1001-dev.eqiad.wmnet
* 10:56 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy3002.esams.wmnet
* 10:56 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-staging-master-codfw
* 10:55 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4001.ulsfo.wmnet
* 10:55 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-staging-worker-codfw
* 10:54 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudbackup1001-dev.eqiad.wmnet
* 10:52 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-staging-master-eqiad
* 10:50 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.reboot-nodes (exit_code=0) rolling reboot on A:wikikube-staging-worker-eqiad
* 10:50 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 10:50 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4001.ulsfo.wmnet
* 10:50 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4002.ulsfo.wmnet
* 10:50 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1019.eqiad.wmnet with reason: host reimage
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1019.eqiad.wmnet with reason: host reimage
* 10:45 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4002.ulsfo.wmnet
* 10:45 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy5001.eqsin.wmnet
* 10:40 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy5001.eqsin.wmnet
* 10:39 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy5002.eqsin.wmnet
* 10:37 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul2001.codfw.wmnet
* 10:37 ladsgroup@cumin1003: dbctl commit (dc=all): 'Depool', diff saved to https://phabricator.wikimedia.org/P89852 and previous config saved to /var/cache/conftool/dbconfig/20260313-103719-ladsgroup.json
* 10:33 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul2001.codfw.wmnet
* 10:32 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy5002.eqsin.wmnet
* 10:31 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cloudweb2002-dev.wikimedia.org
* 10:31 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul1002.eqiad.wmnet
* 10:31 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy6001.drmrs.wmnet
* 10:29 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:29 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:28 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:28 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:28 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 10:27 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 10:27 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:27 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul1002.eqiad.wmnet
* 10:27 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:26 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy6001.drmrs.wmnet
* 10:24 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host cloudweb2002-dev.wikimedia.org
* 10:23 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy6002.drmrs.wmnet
* 10:22 arnaudb@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host zuul2002.codfw.wmnet
* 10:19 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1008.eqiad.wmnet
* 10:18 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul2002.codfw.wmnet
* 10:18 arnaudb@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host zuul2002.codfw.wmnet
* 10:18 arnaudb@cumin1003: START - Cookbook sre.hosts.reboot-single for host zuul2002.codfw.wmnet
* 10:18 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy6002.drmrs.wmnet
* 10:16 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy7002.magru.wmnet
* 10:16 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-staging-worker-eqiad
* 10:15 jayme@cumin1003: START - Cookbook sre.k8s.reboot-nodes rolling reboot on A:wikikube-staging-worker-codfw
* 10:13 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1008.eqiad.wmnet
* 10:13 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1007.eqiad.wmnet
* 10:12 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy7002.magru.wmnet
* 10:09 jelto@cumin1003: conftool action : set/pooled=yes; selector: name=tcp-proxy7001.magru.wmnet
* 10:08 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1007.eqiad.wmnet
* 10:08 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1006.eqiad.wmnet
* 10:07 jelto@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy7001.magru.wmnet
* 10:03 jelto@cumin1003: START - Cookbook sre.hosts.reboot-single for host tcp-proxy7001.magru.wmnet
* 10:02 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1006.eqiad.wmnet
* 10:02 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1005.eqiad.wmnet
* 10:01 jelto@cumin1003: conftool action : set/pooled=no; selector: name=tcp-proxy7001.magru.wmnet
* 09:58 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 09:57 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1005.eqiad.wmnet
* 09:57 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1004.eqiad.wmnet
* 09:51 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1004.eqiad.wmnet
* 09:51 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1003.eqiad.wmnet
* 09:50 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:50 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 09:46 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1003.eqiad.wmnet
* 09:46 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1002.eqiad.wmnet
* 09:41 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1002.eqiad.wmnet
* 09:40 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-worker1001.eqiad.wmnet
* 09:39 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:39 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:35 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-worker1001.eqiad.wmnet
* 09:35 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-ctrl1002.eqiad.wmnet
* 09:34 filippo@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tools-k8s-ctrl1001.eqiad.wmnet
* 09:34 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:33 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:32 moritzm: installing Linux 6.1.164 on Bookworm hosts
* 09:30 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-ctrl1002.eqiad.wmnet
* 09:28 filippo@cumin1003: START - Cookbook sre.hosts.reboot-single for host tools-k8s-ctrl1001.eqiad.wmnet
* 09:01 elukey@cumin1003: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 08:37 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 07:56 moritzm: installing Linux 6.12.74 on Trixie hosts
* 07:55 moritzm: installing 6.12.74 on Trixie hosts
* 02:57 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4044.ulsfo.wmnet [reason: trixie reimaging]
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 18s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:41 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4044.ulsfo.wmnet with OS trixie
* 01:37 mutante: contint1003/contint2003 - every time(?) we set up machines with puppet using our httpd module and PHP, the first puppet run hits the same old issue, with "Exec[ensure_present_mod_php" failing and the message "Considering conflict mpm_worker for mpm_prefork". The fix is to run 'sudo a2dismod mpm_event' and then run puppet again. [[phab:T418521|T418521]]
* 01:26 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on contint1003.wikimedia.org with reason: [[phab:T418521|T418521]]
* 01:26 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on contint2003.wikimedia.org with reason: [[phab:T418521|T418521]]
* 01:23 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on contint2003.wikimedia.org with reason: setup
* 01:22 dzahn@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on contint1003.wikimedia.org with reason: setup
* 01:22 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4047.*
* 01:09 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage
* 01:08 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4043.ulsfo.wmnet [reason: trixie reimaging]
* 01:06 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4044.ulsfo.wmnet with reason: host reimage
* 01:05 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4043.ulsfo.wmnet with OS trixie
* 00:55 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4047.ulsfo.wmnet with OS trixie
* 00:45 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4044.ulsfo.wmnet with OS trixie
* 00:45 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4044.ulsfo.wmnet [reason: trixie reimaging]
* 00:42 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4042.ulsfo.wmnet [reason: trixie reimaging]
* 00:41 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4042.ulsfo.wmnet with OS trixie
* 00:39 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4043.ulsfo.wmnet with reason: host reimage
* 00:31 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4043.ulsfo.wmnet with reason: host reimage
* 00:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4047.ulsfo.wmnet with reason: host reimage
* 00:27 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1251187 [[phab:T419637|T419637]] (duration: 07m 12s)
* 00:23 rzl@deploy2002: rzl: Continuing with sync
* 00:23 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4047.ulsfo.wmnet with reason: host reimage
* 00:22 rzl@deploy2002: rzl: https://gerrit.wikimedia.org/r/1251187 [[phab:T419637|T419637]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:21 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1251187 [[phab:T419637|T419637]]
* 00:15 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage
* 00:14 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4040.ulsfo.wmnet [reason: trixie reimaging]
* 00:11 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4042.ulsfo.wmnet with reason: host reimage
* 00:11 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4043.ulsfo.wmnet with OS trixie
* 00:10 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4040.ulsfo.wmnet with OS trixie
* 00:04 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4047.ulsfo.wmnet with OS trixie
* 00:03 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp4047.ulsfo.wmnet with OS trixie
* 00:03 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4043.ulsfo.wmnet with OS trixie
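
The 01:37 entry above (contint1003/contint2003, [[phab:T418521|T418521]]) describes a two-step manual workaround: disable <code>mpm_event</code>, then run puppet again. A minimal sketch of those two steps follows. Only <code>sudo a2dismod mpm_event</code> comes from the log entry. The <code>DRY_RUN</code> variable and the <code>run_step</code> and <code>fix_mpm_conflict</code> helpers are hypothetical names added for illustration, and <code>puppet agent --test</code> is an assumption standing in for however puppet is normally run on these hosts.

```shell
#!/bin/sh
# Sketch of the manual workaround from the 01:37 log entry (T418521).
# Assumption: DRY_RUN, run_step and fix_mpm_conflict are illustrative names,
# not part of any existing tooling.

# Default to printing the commands rather than running them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run_step() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

fix_mpm_conflict() {
    # Step 1 (from the log entry): disable the conflicting Apache MPM.
    run_step sudo a2dismod mpm_event || return 1
    # Step 2: run puppet again. "puppet agent --test" is an assumption;
    # the host may use a site-specific wrapper instead.
    run_step sudo puppet agent --test
}

fix_mpm_conflict
```

With the default <code>DRY_RUN=1</code> the script only prints the two commands. Setting <code>DRY_RUN=0</code> would run them for real, which is only appropriate on a host that is actually showing the conflict.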
== 2026-03-12 ==
* 23:57 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host o11ytest1001.eqiad.wmnet with OS trixie
* 23:53 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 23:53 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 23:50 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4042.ulsfo.wmnet with OS trixie
* 23:49 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 23:49 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 23:45 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 23:45 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 23:45 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage
* 23:44 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4042.ulsfo.wmnet with OS trixie
* 23:41 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4040.ulsfo.wmnet with reason: host reimage
* 23:41 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 23:41 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 23:40 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on o11ytest1001.eqiad.wmnet with reason: host reimage
* 23:36 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 23:36 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on o11ytest1001.eqiad.wmnet with reason: host reimage
* 23:36 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 23:35 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 23:35 rzl@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 23:22 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host o11ytest1001
* 23:22 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host o11ytest1001
* 23:21 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4040.ulsfo.wmnet with OS trixie
* 23:19 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4040.ulsfo.wmnet with OS trixie
* 23:18 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host o11ytest1001
* 23:18 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) o11ytest1001.eqiad.wmnet 141.32.64.10.in-addr.arpa 1.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 23:18 herron@cumin1003: START - Cookbook sre.dns.wipe-cache o11ytest1001.eqiad.wmnet 141.32.64.10.in-addr.arpa 1.4.1.0.2.3.0.0.4.6.0.0.0.1.0.0.3.0.1.0.1.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 23:18 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 23:18 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host o11ytest1001 - herron@cumin1003"
* 23:18 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host o11ytest1001 - herron@cumin1003"
* 23:04 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4047.ulsfo.wmnet with OS trixie
* 23:00 herron@cumin1003: START - Cookbook sre.dns.netbox
* 23:00 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host o11ytest1001
* 22:59 herron@cumin1003: START - Cookbook sre.hosts.reimage for host o11ytest1001.eqiad.wmnet with OS trixie
* 22:58 herron@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mwlog1002 to o11ytest1001
* 22:57 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host o11ytest1001
* 22:55 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host o11ytest1001
* 22:55 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) o11ytest1001 on all recursors
* 22:55 herron@cumin1003: START - Cookbook sre.dns.wipe-cache o11ytest1001 on all recursors
* 22:55 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:55 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mwlog1002 to o11ytest1001 - herron@cumin1003"
* 22:54 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mwlog1002 to o11ytest1001 - herron@cumin1003"
* 22:51 herron@cumin1003: START - Cookbook sre.dns.netbox
* 22:50 herron@cumin1003: START - Cookbook sre.hosts.rename from mwlog1002 to o11ytest1001
* 22:42 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4043.ulsfo.wmnet with OS trixie
* 22:42 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4043.ulsfo.wmnet [reason: trixie reimaging]
* 22:41 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4041.ulsfo.wmnet [reason: trixie reimaging]
* 22:39 bvibber@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251190{{!}}Enable ReaderExperiments Share Highlight subfeature for metrics (T416945)]], [[gerrit:1251195{{!}}Metrics module for share highlight experiment baseline (T416945)]] (duration: 06m 49s)
* 22:38 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4041.ulsfo.wmnet with OS trixie
* 22:35 bvibber@deploy2002: bvibber: Continuing with sync
* 22:34 bvibber@deploy2002: bvibber: Backport for [[gerrit:1251190{{!}}Enable ReaderExperiments Share Highlight subfeature for metrics (T416945)]], [[gerrit:1251195{{!}}Metrics module for share highlight experiment baseline (T416945)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:32 bvibber@deploy2002: Started scap sync-world: Backport for [[gerrit:1251190{{!}}Enable ReaderExperiments Share Highlight subfeature for metrics (T416945)]], [[gerrit:1251195{{!}}Metrics module for share highlight experiment baseline (T416945)]]
* 22:28 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1251182 [[phab:T419637|T419637]] (duration: 11m 18s)
* 22:27 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host o11ytest2001.codfw.wmnet with OS trixie
* 22:26 rzl@deploy2002: rzl: Continuing with sync
* 22:24 rzl@deploy2002: rzl: https://gerrit.wikimedia.org/r/1251182 [[phab:T419637|T419637]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:23 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4042.ulsfo.wmnet with OS trixie
* 22:23 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4042.ulsfo.wmnet [reason: trixie reimaging]
* 22:20 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4046.*
* 22:17 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1251182 [[phab:T419637|T419637]]
* 22:13 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4041.ulsfo.wmnet with reason: host reimage
* 22:09 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on o11ytest2001.codfw.wmnet with reason: host reimage
* 22:08 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4041.ulsfo.wmnet with reason: host reimage
* 22:03 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on o11ytest2001.codfw.wmnet with reason: host reimage
* 22:01 jasmine@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:58 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4040.ulsfo.wmnet with OS trixie
* 21:58 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4040.ulsfo.wmnet [reason: trixie reimaging]
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host o11ytest2001
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host o11ytest2001
* 21:45 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host o11ytest2001
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) o11ytest2001.codfw.wmnet 9.32.192.10.in-addr.arpa 9.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:45 herron@cumin1003: START - Cookbook sre.dns.wipe-cache o11ytest2001.codfw.wmnet 9.32.192.10.in-addr.arpa 9.0.0.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:45 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host o11ytest2001 - herron@cumin1003"
* 21:45 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host o11ytest2001 - herron@cumin1003"
* 21:43 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4041.ulsfo.wmnet with OS trixie
* 21:41 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4038.ulsfo.wmnet [reason: trixie reimaging]
* 21:40 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4038.ulsfo.wmnet with OS trixie
* 21:39 herron@cumin1003: START - Cookbook sre.dns.netbox
* 21:39 herron@cumin1003: START - Cookbook sre.hosts.move-vlan for host o11ytest2001
* 21:39 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2006.codfw.wmnet with OS trixie
* 21:39 herron@cumin1003: START - Cookbook sre.hosts.reimage for host o11ytest2001.codfw.wmnet with OS trixie
* 21:36 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 21:35 herron@cumin1003: END (PASS) - Cookbook sre.hosts.rename (exit_code=0) from mwlog2002 to o11ytest2001
* 21:35 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 21:35 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 21:35 herron@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host o11ytest2001
* 21:34 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 21:34 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 21:33 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 21:32 herron@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host o11ytest2001
* 21:32 herron@cumin1003: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) o11ytest2001 on all recursors
* 21:32 herron@cumin1003: START - Cookbook sre.dns.wipe-cache o11ytest2001 on all recursors
* 21:32 herron@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 21:32 herron@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mwlog2002 to o11ytest2001 - herron@cumin1003"
* 21:31 herron@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Renaming mwlog2002 to o11ytest2001 - herron@cumin1003"
* 21:28 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4046.ulsfo.wmnet with OS trixie
* 21:27 herron@cumin1003: START - Cookbook sre.dns.netbox
* 21:26 herron@cumin1003: START - Cookbook sre.hosts.rename from mwlog2002 to o11ytest2001
* 21:20 rzl: rzl@apt1002:~$ sudo -i reprepro copy trixie-wikimedia bullseye-wikimedia envoyproxy
* 21:20 rzl: rzl@apt1002:~$ sudo -i reprepro copy bookworm-wikimedia bullseye-wikimedia envoyproxy
* 21:20 rzl: rzl@apt1002:~$ sudo -i reprepro -C main includedeb bullseye-wikimedia /srv/wikimedia/pool/component/envoy-future/e/envoyproxy/envoyproxy_1.35.9-1_amd64.deb
* 21:13 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage
* 21:13 cscott@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251173{{!}}Revert "Move post-processing of flaggedrevs views inside FlaggablePageView"]] (duration: 07m 28s)
* 21:09 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4038.ulsfo.wmnet with reason: host reimage
* 21:09 cscott@deploy2002: cscott: Continuing with sync
* 21:07 cscott@deploy2002: cscott: Backport for [[gerrit:1251173{{!}}Revert "Move post-processing of flaggedrevs views inside FlaggablePageView"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:05 cscott@deploy2002: Started scap sync-world: Backport for [[gerrit:1251173{{!}}Revert "Move post-processing of flaggedrevs views inside FlaggablePageView"]]
* 21:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage
* 21:02 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251152{{!}}Use 'alwaysShowLogin' query parameter during login (T419723)]], [[gerrit:1251150{{!}}login: Add 'alwaysShowLogin' login URL parameter (T419723)]], [[gerrit:1251168{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1250750{{!}}Enable parser survey for opted-out users on ru/pt/ja/id wikis (T414852)]] (duration: 10m 41s)
* 20:58 tgr@deploy2002: tgr, jsn, cscott: Continuing with sync
* 20:58 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4046.ulsfo.wmnet with reason: host reimage
* 20:54 tgr@deploy2002: tgr, jsn, cscott: Backport for [[gerrit:1251152{{!}}Use 'alwaysShowLogin' query parameter during login (T419723)]], [[gerrit:1251150{{!}}login: Add 'alwaysShowLogin' login URL parameter (T419723)]], [[gerrit:1251168{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1250750{{!}}Enable parser survey for opted-out users on ru/pt/ja/id wikis (T414852)]] synced to the testservers (see https://wikitech
* 20:52 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1251152{{!}}Use 'alwaysShowLogin' query parameter during login (T419723)]], [[gerrit:1251150{{!}}login: Add 'alwaysShowLogin' login URL parameter (T419723)]], [[gerrit:1251168{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1250750{{!}}Enable parser survey for opted-out users on ru/pt/ja/id wikis (T414852)]]
* 20:49 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 20:43 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251087{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251088{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251106{{!}}phpunit: Avoid unnecessary writes in generatePHPUnitConfig.php (T419107)]] (duration: 07m 37s)
* 20:39 tgr@deploy2002: tgr, daimona: Continuing with sync
* 20:37 tgr@deploy2002: tgr, daimona: Backport for [[gerrit:1251087{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251088{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251106{{!}}phpunit: Avoid unnecessary writes in generatePHPUnitConfig.php (T419107)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:37 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4046.ulsfo.wmnet with OS trixie
* 20:35 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1251087{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251088{{!}}Set 'sub' JWT field in client credentials access tokens (T417278)]], [[gerrit:1251106{{!}}phpunit: Avoid unnecessary writes in generatePHPUnitConfig.php (T419107)]]
* 20:35 jsn@deploy2002: Synchronized wmf-config/throttle.php: (no justification provided) (duration: 01m 57s)
* 20:32 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp4045.*
* 20:28 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4041.ulsfo.wmnet with OS trixie
* 20:20 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 20:18 jsn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249364{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1251140{{!}}[arwikiquote] add namespace alias for NS_PROJECT (T419828)]], [[gerrit:1251098{{!}}Deploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1251164{{!}}Increase IP cap limit for azwiki (T419899)]] (duration: 11m 11s)
* 20:14 jsn@deploy2002: jsn, dani, nmw03, gergesshamon: Continuing with sync
* 20:09 jsn@deploy2002: jsn, dani, nmw03, gergesshamon: Backport for [[gerrit:1249364{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1251140{{!}}[arwikiquote] add namespace alias for NS_PROJECT (T419828)]], [[gerrit:1251098{{!}}Deploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1251164{{!}}Increase IP cap limit for azwiki (T419899)]] synced to the testservers (see https://wikitech.wikimedia.org/wik
* 20:07 jsn@deploy2002: Started scap sync-world: Backport for [[gerrit:1249364{{!}}PersonalDashboard: enable CTA for pilot wikis (T418613)]], [[gerrit:1251140{{!}}[arwikiquote] add namespace alias for NS_PROJECT (T419828)]], [[gerrit:1251098{{!}}Deploy participant recruitment survey on frwiki (T419778)]], [[gerrit:1251164{{!}}Increase IP cap limit for azwiki (T419899)]]
* 19:21 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mathoid: apply
* 19:21 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mathoid: apply
* 19:20 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mathoid: apply
* 19:19 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mathoid: apply
* 19:16 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 19:16 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 19:15 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 19:14 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 19:13 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 19:12 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 19:12 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 19:11 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 19:07 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4041.ulsfo.wmnet with OS trixie
* 19:06 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4041.ulsfo.wmnet [reason: trixie reimaging]
* 19:06 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4039.ulsfo.wmnet [reason: trixie reimaging]
* 19:06 swfrench-wmf: reprepro include xdebug_3.4.4-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include wikidiff2_1.14.1-2+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include php-yaml_2.2.4-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include php-xhprof_2.3.10-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include php-wmerrors_2.0.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 swfrench-wmf: reprepro include php-uuid_1.3.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:05 brennen@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251138{{!}}EditPage: Re-add catch block for MWException (T419883)]] (duration: 09m 46s)
* 19:04 swfrench-wmf: reprepro include php-redis_6.2.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:04 swfrench-wmf: reprepro include php-pcov_1.0.12-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:04 swfrench-wmf: reprepro include php-memcached_3.3.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:04 swfrench-wmf: reprepro include php-luasandbox_4.1.2-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:03 swfrench-wmf: reprepro include php-imagick_3.7.0-13+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:03 swfrench-wmf: reprepro include php-excimer_1.2.5-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 19:01 brennen@deploy2002: somerandomdeveloper, brennen: Continuing with sync
* 18:59 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1019.eqiad.wmnet with OS bookworm
* 18:57 brennen@deploy2002: somerandomdeveloper, brennen: Backport for [[gerrit:1251138{{!}}EditPage: Re-add catch block for MWException (T419883)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:55 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4039.ulsfo.wmnet with OS trixie
* 18:55 brennen@deploy2002: Started scap sync-world: Backport for [[gerrit:1251138{{!}}EditPage: Re-add catch block for MWException (T419883)]]
* 18:52 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/mathoid: apply
* 18:52 rzl@deploy2002: helmfile [staging] START helmfile.d/services/mathoid: apply
* 18:42 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp20(2[789]{{!}}3[0-9]{{!}}40).*,service=ats-be
* 18:34 brennen@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.19 refs [[phab:T413810|T413810]]
* 18:29 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4039.ulsfo.wmnet with reason: host reimage
* 18:29 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:28 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Updating dse-k8s-worker1019 - btullis@cumin1003"
* 18:26 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2332.codfw.wmnet
* 18:26 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2332.codfw.wmnet
* 18:25 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Updating dse-k8s-worker1019 - btullis@cumin1003"
* 18:24 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4039.ulsfo.wmnet with reason: host reimage
* 18:23 brennen@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251139{{!}}Ensure that we always run ParserHooks::transformHtml() when using Parsoid (T419830)]] (duration: 14m 46s)
* 18:21 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 18:20 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4038.ulsfo.wmnet with OS trixie
* 18:19 brennen@deploy2002: cscott, brennen: Continuing with sync
* 18:18 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 18:16 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4045.ulsfo.wmnet with OS trixie
* 18:10 brennen@deploy2002: cscott, brennen: Backport for [[gerrit:1251139{{!}}Ensure that we always run ParserHooks::transformHtml() when using Parsoid (T419830)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:10 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1019.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:08 brennen@deploy2002: Started scap sync-world: Backport for [[gerrit:1251139{{!}}Ensure that we always run ParserHooks::transformHtml() when using Parsoid (T419830)]]
* 18:02 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1019.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:02 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 17:59 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4039.ulsfo.wmnet with OS trixie
* 17:58 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1019
* 17:58 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1019
* 17:56 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4039.ulsfo.wmnet [reason: trixie reimaging]
* 17:55 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp20(3[6-9]{{!}}4[012]).*
* 17:54 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 17:53 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp4037.ulsfo.wmnet [reason: trixie reimaging]
* 17:52 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4045.ulsfo.wmnet with reason: host reimage
* 17:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp4037.ulsfo.wmnet with OS trixie
* 17:49 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4045.ulsfo.wmnet with reason: host reimage
* 17:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1018.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:33 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1018.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:31 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1018
* 17:31 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1018
* 17:30 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 17:28 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS trixie
* 17:28 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cp4045.ulsfo.wmnet with OS trixie
* 17:27 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp203[0-5].*
* 17:24 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp4037.ulsfo.wmnet with reason: host reimage
* 17:20 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 17:20 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 17:18 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp4037.ulsfo.wmnet with reason: host reimage
* 17:17 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-backup1004.eqiad.wmnet with OS trixie
* 17:17 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 17:16 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 17:06 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp202[89].*
* 17:03 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp2027.*
* 16:59 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4038.ulsfo.wmnet with OS trixie
* 16:58 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4038.ulsfo.wmnet [reason: trixie reimaging]
* 16:58 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-backup1004.eqiad.wmnet with reason: host reimage
* 16:58 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:57 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp4037.ulsfo.wmnet with OS trixie
* 16:57 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp4037.ulsfo.wmnet [reason: trixie reimaging]
* 16:55 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-backup1004.eqiad.wmnet with reason: host reimage
* 16:50 swfrench-wmf: reprepro include php-apcu_5.1.24-1+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:45 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 16:45 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 16:44 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 16:43 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 16:43 swfrench-wmf: reprepro include dh-php_5.5+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:42 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1018.eqiad.wmnet with OS bookworm
* 16:41 swfrench-wmf: reprepro include php-defaults_94+wmf11u1+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:37 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ms-backup1004.eqiad.wmnet with OS trixie
* 16:36 swfrench-wmf: reprepro include php8.3_8.3.30-1+wmf11u2+icu72u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 16:27 dzahn@dns1004: END - running authdns-update
* 16:26 dzahn@dns1004: START - running authdns-update
* 16:25 mutante: switching old status.wikimedia.org page away from rackspace [[phab:T414098|T414098]]
* 16:25 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp4045.ulsfo.wmnet with OS trixie
* 16:20 dzahn@deploy2002: helmfile [eqiad] DONE helmfile.d/services/miscweb: apply
* 16:20 dzahn@deploy2002: helmfile [eqiad] START helmfile.d/services/miscweb: apply
* 16:19 dzahn@deploy2002: helmfile [codfw] DONE helmfile.d/services/miscweb: apply
* 16:19 dzahn@deploy2002: helmfile [codfw] START helmfile.d/services/miscweb: apply
* 16:12 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
* 16:11 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
* 16:10 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 16:09 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 16:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 16:09 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 16:08 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 16:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 16:07 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 16:06 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 16:05 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
* 16:04 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
* 16:04 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 16:04 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 16:03 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
* 16:02 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
* 16:02 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 16:02 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 16:01 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-fr-tech: apply
* 15:58 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 15:57 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 15:57 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 15:56 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 15:52 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts cloudgw2002-dev.codfw.wmnet
* 15:52 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 15:52 andrew@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw2002-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 15:47 andrew@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: cloudgw2002-dev.codfw.wmnet decommissioned, removing all IPs except the asset tag one - andrew@cumin2002"
* 15:43 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 15:41 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:40 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:36 andrew@cumin2002: START - Cookbook sre.hosts.decommission for hosts cloudgw2002-dev.codfw.wmnet
* 15:35 joal@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 15:33 joal@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 15:27 ebernhardson@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 15:26 ebernhardson@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 15:19 moritzm: reuploaded libxml2 2.9.10+dfsg-6.7+deb11u9+wmf11u1 and 72.1-3+deb12u1~wmf11u1 to component/php83-icu72 for bullseye-wikimedia [[phab:T419058|T419058]]
* 15:14 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:13 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:13 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=tcp-proxy4004.ulsfo.wmnet
* 15:13 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=tcp-proxy4004.ulsfo.wmnet
* 15:12 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=tcp-proxy4003.ulsfo.wmnet
* 15:12 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=tcp-proxy4003.ulsfo.wmnet
* 15:00 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 15:00 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:57 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:56 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:45 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:44 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1018.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:34 andrew@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 14:33 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1018.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:31 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 14:31 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1018
* 14:31 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1018
* 14:25 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:24 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:20 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: name=clouddb1020.eqiad.wmnet
* 14:15 ayounsi@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on 24 hosts with reason: Switch BGP bounce
* 14:12 ayounsi@cumin1003: conftool action : set/pooled=no; selector: name=clouddb1020.eqiad.wmnet
* 14:09 mlitn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251034{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251035{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251036{{!}}Remove queueing logic (T419587)]], [[gerrit:1251037{{!}}Remove queueing logic (T419587)]] (duration: 07m 15s)
* 14:08 andrew@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 14:07 akhatun@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:07 akhatun@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-edit-type-enrich-next: apply
* 14:05 mlitn@deploy2002: mlitn: Continuing with sync
* 14:04 mlitn@deploy2002: mlitn: Backport for [[gerrit:1251034{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251035{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251036{{!}}Remove queueing logic (T419587)]], [[gerrit:1251037{{!}}Remove queueing logic (T419587)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:03 XioNoX: start eqiad rack D2 depools
* 14:02 mlitn@deploy2002: Started scap sync-world: Backport for [[gerrit:1251034{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251035{{!}}Update CSS selector for Mobile TOC button (T419587)]], [[gerrit:1251036{{!}}Remove queueing logic (T419587)]], [[gerrit:1251037{{!}}Remove queueing logic (T419587)]]
* 13:59 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:59 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:57 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:54 moritzm: installing libssh security updates
* 13:54 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:45 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251031{{!}}ext.testKitchen: Depend on mediawiki.user module]], [[gerrit:1251048{{!}}Add title to the request context in FlaggedRevsCacheTest (T419539)]], [[gerrit:1251032{{!}}ext.testKitchen: Depend on mediawiki.user module]] (duration: 08m 01s)
* 13:42 phuedx@deploy2002: phuedx: Continuing with sync
* 13:39 phuedx@deploy2002: phuedx: Backport for [[gerrit:1251031{{!}}ext.testKitchen: Depend on mediawiki.user module]], [[gerrit:1251048{{!}}Add title to the request context in FlaggedRevsCacheTest (T419539)]], [[gerrit:1251032{{!}}ext.testKitchen: Depend on mediawiki.user module]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:37 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1251031{{!}}ext.testKitchen: Depend on mediawiki.user module]], [[gerrit:1251048{{!}}Add title to the request context in FlaggedRevsCacheTest (T419539)]], [[gerrit:1251032{{!}}ext.testKitchen: Depend on mediawiki.user module]]
* 13:26 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1251005{{!}}Deploy EditCheck suggestion mode at all Wikipedias (T415320)]] (duration: 06m 42s)
* 13:24 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:22 esanders@deploy2002: esanders: Continuing with sync
* 13:22 esanders@deploy2002: esanders: Backport for [[gerrit:1251005{{!}}Deploy EditCheck suggestion mode at all Wikipedias (T415320)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1178.eqiad.wmnet
* 13:21 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 13:20 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1251005{{!}}Deploy EditCheck suggestion mode at all Wikipedias (T415320)]]
* 13:18 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250656{{!}}Add multilingual revert risk host header for LiftWing requests (T419718)]] (duration: 10m 52s)
* 13:14 fnegri@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.add-wiki (exit_code=99) for database kaiwiki ([[phab:T414240|T414240]])
* 13:14 fnegri@cumin1003: START - Cookbook sre.wikireplicas.add-wiki for database kaiwiki ([[phab:T414240|T414240]])
* 13:14 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 13:14 kgraessle@deploy2002: kgraessle: Continuing with sync
* 13:12 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 13:11 kgraessle@deploy2002: kgraessle: Backport for [[gerrit:1250656{{!}}Add multilingual revert risk host header for LiftWing requests (T419718)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:10 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 13:07 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1250656{{!}}Add multilingual revert risk host header for LiftWing requests (T419718)]]
* 13:05 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1159.eqiad.wmnet
* 13:03 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 13:02 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 13:02 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:02 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:57 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1159.eqiad.wmnet
* 12:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1013.eqiad.wmnet
* 12:49 dpogorzelski@deploy2002: helmfile [staging] DONE helmfile.d/services/changeprop: sync
* 12:49 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1013.eqiad.wmnet
* 12:49 dpogorzelski@deploy2002: helmfile [staging] START helmfile.d/services/changeprop: sync
* 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 12:33 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:31 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 12:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host durum4004.ulsfo.wmnet
* 12:29 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4004.ulsfo.wmnet
* 12:28 moritzm: installing postgresql-17 security updates
* 12:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4004.ulsfo.wmnet
* 12:14 moritzm: installing wireshark security updates
* 12:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1013.eqiad.wmnet with reason: host reimage
* 12:07 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1013.eqiad.wmnet with reason: host reimage
* 11:52 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:51 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:51 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:50 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:50 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host tcp-proxy4004.ulsfo.wmnet
* 11:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy4004.ulsfo.wmnet with OS trixie
* 11:49 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1013.eqiad.wmnet with OS bookworm
* 11:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy4004.ulsfo.wmnet with reason: host reimage
* 11:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy4004.ulsfo.wmnet with reason: host reimage
* 11:19 jayme: disabled puppet on all wikikube worker nodes to roll out/test new AppArmor profiles in staging - [[phab:T419781|T419781]]
* 11:07 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host tcp-proxy4004.ulsfo.wmnet with OS trixie
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy4004.ulsfo.wmnet - jmm@cumin2002"
* 11:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy4004.ulsfo.wmnet - jmm@cumin2002"
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) tcp-proxy4004.ulsfo.wmnet on all recursors
* 11:06 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache tcp-proxy4004.ulsfo.wmnet on all recursors
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:06 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy4004.ulsfo.wmnet - jmm@cumin2002"
* 11:03 vriley@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:00 vriley@cumin1003: START - Cookbook sre.dns.netbox
* 10:56 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy4004.ulsfo.wmnet - jmm@cumin2002"
* 10:42 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device asw1-23-ulsfo
* 10:41 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device asw1-23-ulsfo
* 10:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1013.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 10:33 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host tcp-proxy4003.ulsfo.wmnet
* 10:31 vgutierrez@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir4001.ulsfo.wmnet
* 10:31 vgutierrez@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir4002.ulsfo.wmnet
* 10:31 vgutierrez@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir4004.ulsfo.wmnet
* 10:30 vgutierrez@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir4003.ulsfo.wmnet
* 10:30 vgutierrez: repooling ncredir4003 & ncredir4004
* 10:29 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host tcp-proxy4003.ulsfo.wmnet
* 10:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host tcp-proxy4004.ulsfo.wmnet
* 10:26 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1013.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 10:26 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 10:25 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1013
* 10:22 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1013
* 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host tcp-proxy4003.ulsfo.wmnet
* 10:20 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host tcp-proxy4003.ulsfo.wmnet with OS trixie
* 10:12 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 10:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1011.eqiad.wmnet
* 10:12 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 10:11 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:11 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:10 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 10:09 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 10:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1011.eqiad.wmnet
* 10:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1010.eqiad.wmnet
* 10:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on tcp-proxy4003.ulsfo.wmnet with reason: host reimage
* 10:00 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1010.eqiad.wmnet
* 09:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on tcp-proxy4003.ulsfo.wmnet with reason: host reimage
* 09:48 trueg@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/services/SERVICE_NAME: apply
* 09:48 trueg@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/services/SERVICE_NAME: apply
* 09:40 mvernon@cumin2002: conftool action : set/pooled=yes; selector: name=ms-fe2024.codfw.wmnet
* 09:40 mvernon@cumin2002: conftool action : set/pooled=yes; selector: name=ms-fe2023.codfw.wmnet
* 09:40 mvernon@cumin2002: conftool action : set/pooled=yes; selector: name=ms-fe2022.codfw.wmnet
* 09:40 mvernon@cumin2002: conftool action : set/pooled=yes; selector: name=ms-fe2021.codfw.wmnet
* 09:39 mvernon@cumin2002: conftool action : set/weight=40; selector: name=ms-fe2024.codfw.wmnet
* 09:39 mvernon@cumin2002: conftool action : set/weight=40; selector: name=ms-fe2023.codfw.wmnet
* 09:39 mvernon@cumin2002: conftool action : set/weight=40; selector: name=ms-fe2022.codfw.wmnet
* 09:39 mvernon@cumin2002: conftool action : set/weight=40; selector: name=ms-fe2021.codfw.wmnet
* 09:39 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Post reimage - btullis@cumin1003"
* 09:39 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Post reimage - btullis@cumin1003"
* 09:39 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 09:39 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 09:38 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 09:38 btullis@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 09:35 mvernon@cumin2002: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on P<nowiki>{</nowiki>ms-fe[2009-2020].codfw.wmnet<nowiki>}</nowiki> and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 09:32 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host durum4004.ulsfo.wmnet
* 09:32 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 09:32 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host tcp-proxy4003.ulsfo.wmnet with OS trixie
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy4003.ulsfo.wmnet - jmm@cumin2002"
* 09:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM tcp-proxy4003.ulsfo.wmnet - jmm@cumin2002"
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) tcp-proxy4003.ulsfo.wmnet on all recursors
* 09:30 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache tcp-proxy4003.ulsfo.wmnet on all recursors
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy4003.ulsfo.wmnet - jmm@cumin2002"
* 09:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM tcp-proxy4003.ulsfo.wmnet - jmm@cumin2002"
* 09:28 mvernon@cumin2002: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on P<nowiki>{</nowiki>ms-fe[2009-2020].codfw.wmnet<nowiki>}</nowiki> and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 09:28 Emperor: roll-restart codfw ms frontends prior to pooling new ones [[phab:T416243|T416243]]
* 09:26 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host durum4003.ulsfo.wmnet
* 09:23 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:23 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host tcp-proxy4003.ulsfo.wmnet
* 09:22 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host durum4003.ulsfo.wmnet
* 09:01 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts netflow4002.ulsfo.wmnet
* 09:01 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 09:01 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: netflow4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:56 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: netflow4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 08:51 slyngshede@dns1004: END - running authdns-update
* 08:50 slyngshede@dns1004: START - running authdns-update
* 08:37 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:32 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts netflow4002.ulsfo.wmnet
* 08:25 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1004.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 08:23 arnaudb@dns1004: END - running authdns-update
* 08:21 arnaudb@dns1004: START - running authdns-update
* 07:57 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir4004.ulsfo.wmnet
* 07:53 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncredir4004.ulsfo.wmnet
* 07:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ncredir4003.ulsfo.wmnet
* 07:45 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ncredir4003.ulsfo.wmnet
* 05:24 kart_: staging: machinetranslation: Optimize model loading and memory footprints ([[phab:T411058|T411058]])
* 05:19 kartik@deploy2002: helmfile [staging] DONE helmfile.d/services/machinetranslation: apply
* 05:16 kartik@deploy2002: helmfile [staging] START helmfile.d/services/machinetranslation: apply
* 02:16 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2005.codfw.wmnet with OS trixie
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 14s)
* 02:03 swfrench-wmf: reprepro include php-igbinary_3.2.16-4+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:59 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2005.codfw.wmnet with reason: host reimage
* 01:52 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2005.codfw.wmnet with reason: host reimage
* 01:49 swfrench-wmf: reprepro include php-msgpack_3.0.0-1+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:47 swfrench-wmf: reprepro include php-apcu_5.1.24-1+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:37 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2005.codfw.wmnet with OS trixie
* 01:36 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-ctrl2004.codfw.wmnet with OS trixie
* 01:24 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7012.*
* 01:20 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 01:18 jasmine@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-ctrl2004.codfw.wmnet with reason: host reimage
* 01:18 rzl@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 01:15 jasmine@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-ctrl2004.codfw.wmnet with reason: host reimage
* 01:13 swfrench-wmf: reprepro include dh-php_5.5+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:08 swfrench-wmf: reprepro include php-defaults_94+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:05 rzl@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 01:05 rzl@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 01:03 swfrench-wmf: reprepro include php8.3_8.3.30-1+icu72+wmf11u1 into component/php83-icu72 - [[phab:T419058|T419058]]
* 01:00 jasmine@cumin2002: START - Cookbook sre.hosts.reimage for host wikikube-ctrl2004.codfw.wmnet with OS trixie
* 00:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7012.magru.wmnet with OS trixie
* 00:59 brett@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - brett@cumin2002"
* 00:58 brett@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - brett@cumin2002"
* 00:38 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 00:38 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 00:37 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 00:37 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 00:36 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: sync
* 00:36 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: sync
* 00:33 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: sync
* 00:29 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7012.magru.wmnet with reason: host reimage
* 00:27 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: sync
* 00:24 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7012.magru.wmnet with reason: host reimage
* 00:03 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7012.magru.wmnet with OS trixie
== 2026-03-11 ==
* 23:56 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7009.*
* 22:52 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:52 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen: apply
* 22:45 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 22:44 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 22:29 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 22:29 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 22:27 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7009.magru.wmnet with OS trixie
* 21:56 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: sync
* 21:55 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: sync
* 21:54 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250051{{!}}OrchestratorRequest: Switch evaluations to v2 endpoint (T413727)]] (duration: 18m 19s)
* 21:47 jforrester@deploy2002: jforrester: Continuing with sync
* 21:43 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:43 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7009.magru.wmnet with reason: host reimage
* 21:42 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:40 jforrester@deploy2002: jforrester: Backport for [[gerrit:1250051{{!}}OrchestratorRequest: Switch evaluations to v2 endpoint (T413727)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:39 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7009.magru.wmnet with reason: host reimage
* 21:35 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1250051{{!}}OrchestratorRequest: Switch evaluations to v2 endpoint (T413727)]]
* 21:30 rzl: rzl@apt1002:~$ sudo -i reprepro -C component/envoy-future include bullseye-wikimedia /home/rzl/envoyproxy_1.35.9-1_amd64.changes
* 21:29 arlolra@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250665{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]], [[gerrit:1250666{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]] (duration: 35m 16s)
* 21:16 arlolra@deploy2002: arlolra: Continuing with sync
* 21:15 arlolra@deploy2002: arlolra: Backport for [[gerrit:1250665{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]], [[gerrit:1250666{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:08 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7009.magru.wmnet with OS trixie
* 21:08 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7010.*
* 21:04 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7010.magru.wmnet with OS trixie
* 20:54 arlolra@deploy2002: Started scap sync-world: Backport for [[gerrit:1250665{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]], [[gerrit:1250666{{!}}Show category index when no category selected on Special:LintTemplateErrors (T417363)]]
* 20:47 jsn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250579{{!}}urwikisource: add logo, sitename and projectnamespace (T415974)]] (duration: 06m 55s)
* 20:43 jsn@deploy2002: anzx, jsn: Continuing with sync
* 20:42 jsn@deploy2002: anzx, jsn: Backport for [[gerrit:1250579{{!}}urwikisource: add logo, sitename and projectnamespace (T415974)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:40 jsn@deploy2002: Started scap sync-world: Backport for [[gerrit:1250579{{!}}urwikisource: add logo, sitename and projectnamespace (T415974)]]
* 20:38 jsn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250581{{!}}riskyArticleEdits: show page descriptions (T419442)]], [[gerrit:1250582{{!}}Fix Instrumentation on mobile view (T419517)]], [[gerrit:1250632{{!}}ext.wikimediaEvents: Updated Test Kitchen impact test experiment (T407570)]] (duration: 10m 37s)
* 20:38 jhathaway@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 3:00:00 on ml-serve1014.eqiad.wmnet with reason: [[phab:T400626|T400626]]
* 20:37 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7010.magru.wmnet with reason: host reimage
* 20:34 jsn@deploy2002: jsn, sfaci: Continuing with sync
* 20:34 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search-test: apply
* 20:33 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search-test: apply
* 20:32 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7010.magru.wmnet with reason: host reimage
* 20:30 jsn@deploy2002: jsn, sfaci: Backport for [[gerrit:1250581{{!}}riskyArticleEdits: show page descriptions (T419442)]], [[gerrit:1250582{{!}}Fix Instrumentation on mobile view (T419517)]], [[gerrit:1250632{{!}}ext.wikimediaEvents: Updated Test Kitchen impact test experiment (T407570)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:28 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on gitlab1003.wikimedia.org with reason: Upgrade
* 20:28 aokoth@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2 days, 0:00:00 on gitlab2002.wikimedia.org with reason: Upgrade
* 20:27 jsn@deploy2002: Started scap sync-world: Backport for [[gerrit:1250581{{!}}riskyArticleEdits: show page descriptions (T419442)]], [[gerrit:1250582{{!}}Fix Instrumentation on mobile view (T419517)]], [[gerrit:1250632{{!}}ext.wikimediaEvents: Updated Test Kitchen impact test experiment (T407570)]]
* 20:21 andrew@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:18 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 20:17 bvibber@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250647{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]], [[gerrit:1250648{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]] (duration: 06m 47s)
* 20:13 bvibber@deploy2002: bvibber: Continuing with sync
* 20:12 bvibber@deploy2002: bvibber: Backport for [[gerrit:1250647{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]], [[gerrit:1250648{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:10 bvibber@deploy2002: Started scap sync-world: Backport for [[gerrit:1250647{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]], [[gerrit:1250648{{!}}Revert "Fix for temp section open during slow loads on Parsoid" (T416063 T419170 T419721)]]
* 19:59 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7010.magru.wmnet with OS trixie
* 19:54 andrew@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 19:51 andrew@cumin2002: START - Cookbook sre.dns.netbox
* 19:37 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-backup1004.eqiad.wmnet with OS trixie
* 19:01 robh@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts cp7011.magru.wmnet
* 19:01 robh@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts cp7011.magru.wmnet
* 18:56 sukhe@cumin1003: END (PASS) - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns (exit_code=0) rolling restart_daemons on A:wikidough
* 18:49 brennen@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.19 refs [[phab:T413810|T413810]]
* 18:49 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:49 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:45 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:45 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:44 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:44 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 18:43 sukhe@cumin1003: START - Cookbook sre.dns.roll-restart-reboot-wikimedia-dns rolling restart_daemons on A:wikidough
* 18:42 brennen: 1.46.0-wmf.19 train status: no current blockers, going ahead to group1.
* 18:39 swfrench@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker2332.codfw.wmnet
* 18:37 swfrench@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker2332.codfw.wmnet
* 18:20 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7011.*
* 18:18 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 18:16 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ms-backup1004.eqiad.wmnet with OS trixie
* 18:13 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - btullis@cumin1003"
* 17:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 17:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1010.eqiad.wmnet with reason: host reimage
* 17:52 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:52 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:48 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 17:47 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1010.eqiad.wmnet with reason: host reimage
* 17:46 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 17:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 17:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 17:38 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:38 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:38 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:37 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:37 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:36 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:36 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:36 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:35 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 17:34 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 17:32 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: sync
* 17:31 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: sync
* 17:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-api-int: apply
* 17:28 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-api-int: apply
* 17:28 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-web: apply
* 17:27 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-web: apply
* 17:20 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:19 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 17:19 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:18 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:15 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 17:13 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-api-int: apply
* 17:12 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-api-int: apply
* 17:09 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-web: apply
* 17:09 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-web: apply
* 17:02 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7011.magru.wmnet with OS trixie
* 17:01 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 17:00 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 16:58 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on durum4004.ulsfo.wmnet with reason: in setup
* 16:58 jmm@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on durum4003.ulsfo.wmnet with reason: in setup
* 16:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 16:40 root@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:40 root@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moving many things from cloudgw2002-dev to cloudgw2004-dev - root@cumin2002"
* 16:40 root@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: moving many things from cloudgw2002-dev to cloudgw2004-dev - root@cumin2002"
* 16:39 btullis@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on dse-k8s-worker1011.eqiad.wmnet with reason: host reimage
* 16:37 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 16:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 16:35 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7011.magru.wmnet with reason: host reimage
* 16:35 root@cumin2002: START - Cookbook sre.dns.netbox
* 16:32 tappof@cumin1003: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts prometheus4002.ulsfo.wmnet
* 16:32 tappof@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:32 tappof@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - tappof@cumin1003"
* 16:30 tappof@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: prometheus4002.ulsfo.wmnet decommissioned, removing all IPs except the asset tag one - tappof@cumin1003"
* 16:30 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7011.magru.wmnet with reason: host reimage
* 16:25 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 16:23 tappof@cumin1003: START - Cookbook sre.dns.netbox
* 16:18 tappof@cumin1003: START - Cookbook sre.hosts.decommission for hosts prometheus4002.ulsfo.wmnet
* 15:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:52 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7011.magru.wmnet with OS trixie
* 15:51 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 15:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 15:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 15:50 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:49 urbanecm@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 15:48 sukhe: sudo cumin -b1 -s10 "C:dnsrecursor" "run-puppet-agent --enable 'merging CR 1250576'"
* 15:48 urbanecm@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 15:46 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:45 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 15:43 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab2002.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 15:39 sukhe: sudo cumin "C:dnsrecursor" "disable-puppet 'merging CR 1250576'"
* 15:35 aokoth@cumin1003: END (PASS) - Cookbook sre.gitlab.upgrade (exit_code=0) on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 15:26 aokoth@cumin1003: START - Cookbook sre.gitlab.upgrade on GitLab host gitlab1003.wikimedia.org with reason: Security Release - [[phab:T419712|T419712]]
* 15:08 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 15:08 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 15:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:53 swfrench-wmf: updated component/php83-icu72 with libpcre2 10.42-1~wmf11+1 from apt-staging - [[phab:T419058|T419058]]
* 14:46 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1011.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 14:45 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 14:41 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum4004.ulsfo.wmnet
* 14:41 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum4004.ulsfo.wmnet with OS trixie
* 14:39 vgutierrez: depool ncredir4003 && ncredir4004
* 14:38 vgutierrez: repool ncredir4001 && ncredir4002
* 14:31 jmm@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir4002.ulsfo.wmnet
* 14:31 jmm@puppetserver1001: conftool action : set/pooled=no; selector: name=ncredir4001.ulsfo.wmnet
* 14:30 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir4004.ulsfo.wmnet
* 14:30 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=ncredir4004.ulsfo.wmnet
* 14:27 jmm@puppetserver1001: conftool action : set/pooled=yes; selector: name=ncredir4003.ulsfo.wmnet
* 14:27 jmm@puppetserver1001: conftool action : set/weight=1; selector: name=ncredir4003.ulsfo.wmnet
* 14:24 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum4004.ulsfo.wmnet with reason: host reimage
* 14:23 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:23 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:19 moritzm: installing python-urllib3 security updates
* 14:18 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum4004.ulsfo.wmnet with reason: host reimage
* 14:14 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 14:13 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:12 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:12 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:12 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:12 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:11 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:11 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:11 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 14:11 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:10 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:08 gkyziridis@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:08 gkyziridis@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:07 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250568{{!}}Fix pinnableElement export (T419620)]] (duration: 06m 26s)
* 14:06 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 14:05 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 14:04 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 14:04 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 14:03 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 14:03 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 14:03 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:02 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1250568{{!}}Fix pinnableElement export (T419620)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:00 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1250568{{!}}Fix pinnableElement export (T419620)]]
* 13:58 moritzm: uploaded libxml2 2.9.10+dfsg-6.7+deb11u9+wmf11u1 to component/php83-icu72 for bullseye-wikimedia (special build of libxml with ICU disabled to ensure co-installability between icu 67 and icu 72) [[phab:T419058|T419058]]
* 13:57 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250566{{!}}Restore advanced main menu for AMC (T413912)]] (duration: 10m 44s)
* 13:55 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum4004.ulsfo.wmnet with OS trixie
* 13:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:54 vgutierrez: repool cp7016
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4004.ulsfo.wmnet - jmm@cumin2002"
* 13:54 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4004.ulsfo.wmnet - jmm@cumin2002"
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum4004.ulsfo.wmnet on all recursors
* 13:54 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum4004.ulsfo.wmnet on all recursors
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:54 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4004.ulsfo.wmnet - jmm@cumin2002"
* 13:51 jdlrobson@deploy2002: jdlrobson: Continuing with sync
* 13:50 jdlrobson@deploy2002: jdlrobson: Backport for [[gerrit:1250566{{!}}Restore advanced main menu for AMC (T413912)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:49 vgutierrez: depool cp7016
* 13:49 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4004.ulsfo.wmnet - jmm@cumin2002"
* 13:46 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1250566{{!}}Restore advanced main menu for AMC (T413912)]]
* 13:45 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:44 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:44 jdlrobson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247547{{!}}Remove `MetricsPlatform` configuration from production (T416865)]] (duration: 35m 52s)
* 13:43 btullis@cumin1003: START - Cookbook sre.hosts.provision for host dse-k8s-worker1010.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 13:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 13:42 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 13:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4004.ulsfo.wmnet with OS bookworm
* 13:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:36 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum4004.ulsfo.wmnet
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host durum4003.ulsfo.wmnet
* 13:36 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host durum4003.ulsfo.wmnet with OS trixie
* 13:36 bking@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 13:35 bking@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-ipoid-test: apply
* 13:30 jdlrobson@deploy2002: jdlrobson, sfaci: Continuing with sync
* 13:29 jdlrobson@deploy2002: jdlrobson, sfaci: Backport for [[gerrit:1247547{{!}}Remove `MetricsPlatform` configuration from production (T416865)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:22 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
* 13:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on durum4003.ulsfo.wmnet with reason: host reimage
* 13:18 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
* 13:13 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on durum4003.ulsfo.wmnet with reason: host reimage
* 13:08 jdlrobson@deploy2002: Started scap sync-world: Backport for [[gerrit:1247547{{!}}Remove `MetricsPlatform` configuration from production (T416865)]]
* 13:00 moritzm: installing libcommons-lang3-java security updates
* 12:57 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4004.ulsfo.wmnet with OS bookworm
* 12:51 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4003.ulsfo.wmnet with OS bookworm
* 12:46 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host durum4003.ulsfo.wmnet with OS trixie
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4003.ulsfo.wmnet - jmm@cumin2002"
* 12:46 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM durum4003.ulsfo.wmnet - jmm@cumin2002"
* 12:46 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) durum4003.ulsfo.wmnet on all recursors
* 12:45 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache durum4003.ulsfo.wmnet on all recursors
* 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 12:45 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4003.ulsfo.wmnet - jmm@cumin2002"
* 12:41 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM durum4003.ulsfo.wmnet - jmm@cumin2002"
* 12:37 moritzm: installing inetutils security updates
* 12:36 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 12:36 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host durum4003.ulsfo.wmnet
* 12:35 tappof: completed migration from prometheus4002 to prometheus4003 (ulsfo) (T419430)
* 12:34 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
* 12:28 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
* 12:28 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 12:24 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2073.codfw.wmnet with OS bullseye
* 12:23 btullis@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 12:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host netflow4003.ulsfo.wmnet
* 12:18 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1011.eqiad.wmnet with OS bookworm
* 12:17 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1011
* 12:17 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1011
* 12:14 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2072.codfw.wmnet with OS bullseye
* 12:11 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host netflow4003.ulsfo.wmnet
* 12:05 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2073.codfw.wmnet with reason: host reimage
* 12:04 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4003.ulsfo.wmnet with OS bookworm
* 12:01 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1010
* 11:59 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1010
* 11:58 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2073.codfw.wmnet with reason: host reimage
* 11:54 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2072.codfw.wmnet with reason: host reimage
* 11:48 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2072.codfw.wmnet with reason: host reimage
* 11:41 urbanecm@deploy2002: Finished scap sync-world: Backport for [[gerrit:1239954{{!}}[Growth] Enable on every new Wikipedia by default (T304052)]] (duration: 06m 39s)
* 11:38 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2073
* 11:38 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2073
* 11:37 vgutierrez: upgrading to acme-chief 0.39 on acme-chief production instances - [[phab:T419352|T419352]]
* 11:37 urbanecm@deploy2002: urbanecm: Continuing with sync
* 11:36 urbanecm@deploy2002: urbanecm: Backport for [[gerrit:1239954{{!}}[Growth] Enable on every new Wikipedia by default (T304052)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:36 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2073
* 11:36 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2073.codfw.wmnet 212.48.192.10.in-addr.arpa 2.1.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:36 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2073.codfw.wmnet 212.48.192.10.in-addr.arpa 2.1.2.0.8.4.0.0.2.9.1.0.0.1.0.0.4.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:36 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:36 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2073 - mvernon@cumin2002"
* 11:36 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2073 - mvernon@cumin2002"
* 11:35 cgoubert@deploy2002: helmfile [codfw] DONE helmfile.d/services/api-gateway: apply
* 11:34 cgoubert@deploy2002: helmfile [codfw] START helmfile.d/services/api-gateway: apply
* 11:34 urbanecm@deploy2002: Started scap sync-world: Backport for [[gerrit:1239954{{!}}[Growth] Enable on every new Wikipedia by default (T304052)]]
* 11:34 cgoubert@deploy2002: helmfile [eqiad] DONE helmfile.d/services/api-gateway: apply
* 11:34 urbanecm@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250539{{!}}[Growth] kaiwiki: Enable GrowthExperiments (T304052)]] (duration: 14m 11s)
* 11:33 cgoubert@deploy2002: helmfile [eqiad] START helmfile.d/services/api-gateway: apply
* 11:33 cgoubert@deploy2002: helmfile [staging] DONE helmfile.d/services/api-gateway: apply
* 11:32 cgoubert@deploy2002: helmfile [staging] START helmfile.d/services/api-gateway: apply
* 11:32 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:31 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2073
* 11:30 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2073.codfw.wmnet with OS bullseye
* 11:30 urbanecm@deploy2002: urbanecm: Continuing with sync
* 11:29 cgoubert@dns1004: END - running authdns-update
* 11:29 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2072
* 11:29 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2072
* 11:28 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2072
* 11:28 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2072.codfw.wmnet 158.32.192.10.in-addr.arpa 8.5.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:28 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2072.codfw.wmnet 158.32.192.10.in-addr.arpa 8.5.1.0.2.3.0.0.2.9.1.0.0.1.0.0.3.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 11:28 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 11:28 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2072 - mvernon@cumin2002"
* 11:28 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2072 - mvernon@cumin2002"
* 11:28 cgoubert@dns1004: START - running authdns-update
* 11:26 urbanecm@deploy2002: mwscript-k8s job started: WikimediaMaintenance:createExtensionTables.php --wiki=kaiwiki growthexperiments # [[phab:T304052|T304052]]
* 11:24 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 11:24 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2072
* 11:23 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2072.codfw.wmnet with OS bullseye
* 11:22 tappof@dns1004: END - running authdns-update
* 11:22 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
* 11:21 urbanecm@deploy2002: urbanecm: Backport for [[gerrit:1250539{{!}}[Growth] kaiwiki: Enable GrowthExperiments (T304052)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:21 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
* 11:21 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
* 11:21 tappof@dns1004: START - running authdns-update
* 11:21 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
* 11:19 urbanecm@deploy2002: Started scap sync-world: Backport for [[gerrit:1250539{{!}}[Growth] kaiwiki: Enable GrowthExperiments (T304052)]]
* 11:19 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 11:19 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2071.codfw.wmnet with OS bullseye
* 11:18 urbanecm@deploy2002: mwscript-k8s job started: WikimediaMaintenance:createExtensionTables.php --wiki=kaiwiki growthexperiments # [[phab:T304052|T304052]]
* 11:10 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/zotero: apply
* 11:10 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/zotero: apply
* 11:08 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
* 11:08 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
* 11:05 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/zotero: apply
* 11:05 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/zotero: apply
* 10:58 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2071.codfw.wmnet with reason: host reimage
* 10:54 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2071.codfw.wmnet with reason: host reimage
* 10:35 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host ms-be2071
* 10:35 mvernon@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be2071
* 10:34 mvernon@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-be2071
* 10:34 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ms-be2071.codfw.wmnet 221.16.192.10.in-addr.arpa 1.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:34 mvernon@cumin2002: START - Cookbook sre.dns.wipe-cache ms-be2071.codfw.wmnet 221.16.192.10.in-addr.arpa 1.2.2.0.6.1.0.0.2.9.1.0.0.1.0.0.2.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 10:34 mvernon@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:34 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2071 - mvernon@cumin2002"
* 10:34 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host ms-be2071 - mvernon@cumin2002"
* 10:26 mvernon@cumin2002: START - Cookbook sre.dns.netbox
* 10:24 mvernon@cumin2002: START - Cookbook sre.hosts.move-vlan for host ms-be2071
* 10:23 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2071.codfw.wmnet with OS bullseye
* 10:08 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2095.codfw.wmnet with OS bullseye
* 10:03 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Failed step after ml-serve1015's reimage - elukey@cumin1003"
* 10:02 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Failed step after ml-serve1015's reimage - elukey@cumin1003"
* 10:01 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1015.eqiad.wmnet with OS trixie
* 10:01 elukey@cumin1003: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 09:59 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2096.codfw.wmnet with OS bullseye
* 09:52 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2096.codfw.wmnet with OS bullseye
* 09:52 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
* 09:51 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
* 09:51 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be2095.codfw.wmnet with OS bullseye
* 09:51 mvernon@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
* 09:46 mvernon@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - mvernon@cumin2002"
* 09:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
* 09:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
* 09:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 09:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 09:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 09:34 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 09:33 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
* 09:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
* 09:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 09:32 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
* 09:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 09:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
* 09:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
* 09:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 09:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 09:28 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
* 09:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 09:27 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
* 09:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 09:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 09:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 09:24 mvernon@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
* 09:22 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main'.
* 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ncredir4004.ulsfo.wmnet
* 09:19 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4004.ulsfo.wmnet with OS bookworm
* 09:15 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:15 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:14 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 09:10 javiermonton@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249217{{!}}stream: mediawiki.page_html_content_change (T419258)]] (duration: 08m 28s)
* 09:07 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2096.codfw.wmnet with OS bullseye
* 09:06 javiermonton@deploy2002: javiermonton: Continuing with sync
* 09:03 mvernon@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2095.codfw.wmnet with OS bullseye
* 09:03 javiermonton@deploy2002: javiermonton: Backport for [[gerrit:1249217{{!}}stream: mediawiki.page_html_content_change (T419258)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
* 09:01 javiermonton@deploy2002: Started scap sync-world: Backport for [[gerrit:1249217{{!}}stream: mediawiki.page_html_content_change (T419258)]]
* 08:59 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1015.eqiad.wmnet with reason: host reimage
* 08:58 trueg@deploy2002: helmfile [staging] DONE helmfile.d/services/SERVICE_NAME: apply
* 08:58 trueg@deploy2002: helmfile [staging] START helmfile.d/services/SERVICE_NAME: apply
* 08:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4004.ulsfo.wmnet with reason: host reimage
* 08:55 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2239.codfw.wmnet with reason: mysql upgrade / restart
* 08:54 moritzm: installing imagemagick security updates
* 08:52 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1015.eqiad.wmnet with reason: host reimage
* 08:41 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1015.eqiad.wmnet with OS trixie
* 08:40 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ml-serve1014.eqiad.wmnet with OS trixie
* 08:40 elukey@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:39 elukey@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - elukey@cumin1003"
* 08:35 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4004.ulsfo.wmnet with OS bookworm
* 08:32 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
* 08:32 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir4004.ulsfo.wmnet on all recursors
* 08:31 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir4004.ulsfo.wmnet on all recursors
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 08:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
* 08:25 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4004.ulsfo.wmnet - jmm@cumin2002"
* 08:23 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ml-serve1014.eqiad.wmnet with reason: host reimage
* 08:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 08:21 Msz2001: UTC morning backport window finished
* 08:21 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 08:21 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 08:21 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir4004.ulsfo.wmnet
* 08:21 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250426{{!}}Drop underscore from titles in wgOATH2FARequiredGroupRemovalPages]] (duration: 10m 46s)
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host ncredir4003.ulsfo.wmnet
* 08:21 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ncredir4003.ulsfo.wmnet with OS bookworm
* 08:17 elukey@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ml-serve1014.eqiad.wmnet with reason: host reimage
* 08:15 mszwarc@deploy2002: mszwarc: Continuing with sync
* 08:14 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1250426{{!}}Drop underscore from titles in wgOATH2FARequiredGroupRemovalPages]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:10 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1250426{{!}}Drop underscore from titles in wgOATH2FARequiredGroupRemovalPages]]
* 08:09 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249921{{!}}Display list of 2FA-req. groups on AccountSecurity for 2FA-less users (T419422)]], [[gerrit:1250066{{!}}Send2FAWarningNotifications: Support reading users from file (T419111)]] (duration: 33m 07s)
* 08:05 moritzm: installing mariadb bugfix updates from Bookworm point release (tools and libraries as packaged in Debian, unrelated to the wmf-mariadb packages)
* 08:04 elukey@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1014.eqiad.wmnet with OS trixie
* 08:04 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
* 07:58 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ncredir4003.ulsfo.wmnet with reason: host reimage
* 07:57 mszwarc@deploy2002: mszwarc: Continuing with sync
* 07:56 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1249921{{!}}Display list of 2FA-req. groups on AccountSecurity for 2FA-less users (T419422)]], [[gerrit:1250066{{!}}Send2FAWarningNotifications: Support reading users from file (T419111)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:44 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.drain-node (exit_code=99) for draining ganeti node ganeti1049.eqiad.wmnet
* 07:44 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1049.eqiad.wmnet
* 07:43 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1033.eqiad.wmnet
* 07:42 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1033.eqiad.wmnet
* 07:38 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ncredir4003.ulsfo.wmnet with OS bookworm
* 07:36 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1249921{{!}}Display list of 2FA-req. groups on AccountSecurity for 2FA-less users (T419422)]], [[gerrit:1250066{{!}}Send2FAWarningNotifications: Support reading users from file (T419111)]]
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
* 07:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) ncredir4003.ulsfo.wmnet on all recursors
* 07:34 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache ncredir4003.ulsfo.wmnet on all recursors
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:34 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
* 07:34 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM ncredir4003.ulsfo.wmnet - jmm@cumin2002"
* 07:27 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:27 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host ncredir4003.ulsfo.wmnet
* 07:22 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247639{{!}}Enable rr-ml AutoModerator CC Set AutoModeratorMultiLingualRevertRisk with available wikis (T400727)]] (duration: 12m 24s)
* 07:18 kgraessle@deploy2002: kgraessle: Continuing with sync
* 07:12 kgraessle@deploy2002: kgraessle: Backport for [[gerrit:1247639{{!}}Enable rr-ml AutoModerator CC Set AutoModeratorMultiLingualRevertRisk with available wikis (T400727)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:09 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1247639{{!}}Enable rr-ml AutoModerator CC Set AutoModeratorMultiLingualRevertRisk with available wikis (T400727)]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 59s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 00:33 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1250117{{!}}Stop setting $wgImageLinksSchemaMigrationStage (T299953)]] (duration: 09m 38s)
* 00:29 zabe@deploy2002: zabe: Continuing with sync
* 00:26 zabe@deploy2002: zabe: Backport for [[gerrit:1250117{{!}}Stop setting $wgImageLinksSchemaMigrationStage (T299953)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:24 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1250117{{!}}Stop setting $wgImageLinksSchemaMigrationStage (T299953)]]
* 00:03 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
* 00:03 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint1003.wikimedia.org with OS trixie
* 00:03 vriley@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
* 00:03 vriley@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - vriley@cumin1003"
== 2026-03-10 ==
* 23:58 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2095.codfw.wmnet with reason: host reimage
* 23:53 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
* 23:49 jhancock@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be2096.codfw.wmnet with reason: host reimage
* 23:44 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint1003.wikimedia.org with reason: host reimage
* 23:40 vriley@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on contint1003.wikimedia.org with reason: host reimage
* 23:31 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2096.codfw.wmnet with OS bullseye
* 23:31 jhancock@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be2095.codfw.wmnet with OS bullseye
* 23:26 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2095.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:24 jhancock@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be2096.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:22 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 23:22 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:12 vriley@cumin1003: START - Cookbook sre.hosts.provision for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:11 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2096.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 23:05 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 23:05 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:59 jhancock@cumin1003: START - Cookbook sre.hosts.provision for host ms-be2095.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 22:39 cjming@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 22:38 cjming@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/test-kitchen-next: apply
* 21:51 brett@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cp7012.magru.wmnet with OS trixie
* 21:48 Dreamy_Jazz: Evening UTC backport window done
* 21:42 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7006.magru.wmnet [reason: trixie reimaging]
* 21:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7006.magru.wmnet with OS trixie
* 21:25 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1235552{{!}}Migrate EmailAuth, step 2 (T404334)]] (duration: 25m 34s)
* 21:24 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7007.magru.wmnet [reason: trixie reimaging]
* 21:22 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7007.magru.wmnet with OS trixie
* 21:21 tgr@deploy2002: tgr: Continuing with sync
* 21:13 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
* 21:09 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
* 21:02 tgr@deploy2002: tgr: Backport for [[gerrit:1235552{{!}}Migrate EmailAuth, step 2 (T404334)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:00 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1235552{{!}}Migrate EmailAuth, step 2 (T404334)]]
* 20:59 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7012.magru.wmnet with OS trixie
* 20:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7007.magru.wmnet with reason: host reimage
* 20:50 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249983{{!}}Deploy participant recruitment survey on ptwiki and trwiki (T419275)]], [[gerrit:1238733{{!}}wikifunctions: Drop temporary WikifunctionsEnableHTMLOutput flag (T397402)]], [[gerrit:1238734{{!}}wikifunctions: Drop temporary WikifunctionsEnableWikidataInputTypes flag (T397403)]], [[gerrit:1249393{{!}}build: Upgrade mediawiki-phan-config from 0.18.0 to 0.2
* 20:48 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7007.magru.wmnet with reason: host reimage
* 20:46 jforrester@deploy2002: dani, jforrester: Continuing with sync
* 20:45 jforrester@deploy2002: dani, jforrester: Backport for [[gerrit:1249983{{!}}Deploy participant recruitment survey on ptwiki and trwiki (T419275)]], [[gerrit:1238733{{!}}wikifunctions: Drop temporary WikifunctionsEnableHTMLOutput flag (T397402)]], [[gerrit:1238734{{!}}wikifunctions: Drop temporary WikifunctionsEnableWikidataInputTypes flag (T397403)]], [[gerrit:1249393{{!}}build: Upgrade mediawiki-phan-config from 0.18.0 to 0.20.0 (T41
* 20:43 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7006.magru.wmnet with OS trixie
* 20:43 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1249983{{!}}Deploy participant recruitment survey on ptwiki and trwiki (T419275)]], [[gerrit:1238733{{!}}wikifunctions: Drop temporary WikifunctionsEnableHTMLOutput flag (T397402)]], [[gerrit:1238734{{!}}wikifunctions: Drop temporary WikifunctionsEnableWikidataInputTypes flag (T397403)]], [[gerrit:1249393{{!}}build: Upgrade mediawiki-phan-config from 0.18.0 to 0.20
* 20:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7006.magru.wmnet with OS trixie
* 20:42 cdobbins@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - cdobbins@cumin2002"
* 20:38 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1240012{{!}}Enable personal main menu to all users in Minerva Neue skin (T413912)]], [[gerrit:1250007{{!}}Enables legacy processing in ParserOutputPostCacheTransform when cached (T372592)]], [[gerrit:1250015{{!}}Parser: Raise minimum TTL from 30 min to 'next midnight' in miser mode (T416616 T416540 T419439)]] (duration: 12m 58s)
* 20:36 cdobbins@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - cdobbins@cumin2002"
* 20:34 jforrester@deploy2002: jforrester, cscott, bwang: Continuing with sync
* 20:27 jforrester@deploy2002: jforrester, cscott, bwang: Backport for [[gerrit:1240012{{!}}Enable personal main menu to all users in Minerva Neue skin (T413912)]], [[gerrit:1250007{{!}}Enables legacy processing in ParserOutputPostCacheTransform when cached (T372592)]], [[gerrit:1250015{{!}}Parser: Raise minimum TTL from 30 min to 'next midnight' in miser mode (T416616 T416540 T419439)]] synced to the testservers (see https://wikitech.wi
* 20:25 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1240012{{!}}Enable personal main menu to all users in Minerva Neue skin (T413912)]], [[gerrit:1250007{{!}}Enables legacy processing in ParserOutputPostCacheTransform when cached (T372592)]], [[gerrit:1250015{{!}}Parser: Raise minimum TTL from 30 min to 'next midnight' in miser mode (T416616 T416540 T419439)]]
* 20:25 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7007.magru.wmnet with OS trixie
* 20:24 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7007.magru.wmnet [reason: trixie reimaging]
* 20:24 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7005.magru.wmnet [reason: trixie reimaging]
* 20:16 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7005.magru.wmnet with OS trixie
* 20:10 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
* 20:03 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7013.*
* 20:03 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7006.magru.wmnet with reason: host reimage
* 19:50 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7013.magru.wmnet with OS trixie
* 19:49 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7005.magru.wmnet with reason: host reimage
* 19:42 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7005.magru.wmnet with reason: host reimage
* 19:40 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7006.magru.wmnet with OS trixie
* 19:40 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7006.magru.wmnet [reason: trixie reimaging]
* 19:39 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7004.magru.wmnet [reason: trixie reimaging]
* 19:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7013.magru.wmnet with reason: host reimage
* 19:19 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7005.magru.wmnet with OS trixie
* 19:19 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7004.magru.wmnet with OS trixie
* 19:19 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7005.magru.wmnet [reason: trixie reimaging]
* 19:18 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7013.magru.wmnet with reason: host reimage
* 19:17 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7003.magru.wmnet [reason: trixie reimaging]
* 19:16 brennen@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.19 refs [[phab:T413810|T413810]]
* 19:16 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7003.magru.wmnet with OS trixie
* 19:09 brennen: 1.46.0-wmf.19 train status: blockers believed resolved, rolling to group0
* 19:07 brennen@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249937{{!}}Re-add correct namespace for translatable pages (T419294)]] (duration: 12m 30s)
* 19:06 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
* 19:01 brennen@deploy2002: abi, brennen: Continuing with sync
* 18:58 brennen@deploy2002: abi, brennen: Backport for [[gerrit:1249937{{!}}Re-add correct namespace for translatable pages (T419294)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:55 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7013.magru.wmnet with OS trixie
* 18:54 brennen@deploy2002: Started scap sync-world: Backport for [[gerrit:1249937{{!}}Re-add correct namespace for translatable pages (T419294)]]
* 18:52 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7004.magru.wmnet with reason: host reimage
* 18:52 brennen@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.19 refs [[phab:T413810|T413810]] (duration: 38m 34s)
* 18:49 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7004.magru.wmnet with reason: host reimage
* 18:47 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7003.magru.wmnet with reason: host reimage
* 18:44 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7003.magru.wmnet with reason: host reimage
* 18:41 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7015.*
* 18:27 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7015.magru.wmnet with OS trixie
* 18:23 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7004.magru.wmnet with OS trixie
* 18:21 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7004.magru.wmnet [reason: trixie reimaging]
* 18:16 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7003.magru.wmnet with OS trixie
* 18:13 brennen@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.19 refs [[phab:T413810|T413810]]
* 18:13 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp7003.magru.wmnet [reason: trixie reimaging]
* 18:13 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7003.magru.wmnet [reason: trixie reimaging]
* 18:00 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 17:59 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 17:59 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7015.magru.wmnet with reason: host reimage
* 17:59 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 17:59 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 17:56 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7015.magru.wmnet with reason: host reimage
* 17:54 hashar@deploy2002: Finished deploy [integration/docroot@f544f49]: Catch up with composer/npm dev dependencies. Noop for production (duration: 00m 11s)
* 17:54 hashar@deploy2002: Started deploy [integration/docroot@f544f49]: Catch up with composer/npm dev dependencies. Noop for production
* 17:43 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 17:43 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 17:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:32 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:31 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:30 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7015.magru.wmnet with OS trixie
* 17:30 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:29 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:28 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:28 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 17:28 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:26 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:24 vriley@cumin1003: START - Cookbook sre.hosts.provision for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:23 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
* 17:22 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 17:21 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 17:12 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:12 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:11 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:11 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:09 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 17:09 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 17:01 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 16:40 andrew@dns1004: END - running authdns-update
* 16:38 andrew@dns1004: START - running authdns-update
* 16:25 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249993{{!}}Revert "CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled"]] (duration: 07m 45s)
* 16:21 reedy@deploy2002: reedy: Continuing with sync
* 16:19 reedy@deploy2002: reedy: Backport for [[gerrit:1249993{{!}}Revert "CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 16:17 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1249993{{!}}Revert "CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled"]]
* 15:59 jynus@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 15:59 taavi: update cr firewall policy for codfw1dev ldap tree https://gerrit.wikimedia.org/r/1249985
* 15:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/postgresql-airflow-fr-tech: apply
* 15:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/postgresql-airflow-fr-tech: apply
* 15:55 jynus@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 15:48 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:46 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:39 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:34 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 15:28 brouberol@dns1004: END - running authdns-update
* 15:27 brouberol@dns1004: START - running authdns-update
* 15:10 swfrench@cumin2002: END (PASS) - Cookbook sre.deploy.hiddenparma (exit_code=0) Hiddenparma deployment to the alerting hosts with reason: "Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002"
* 15:10 swfrench@cumin2002: END (PASS) - Cookbook sre.deploy.python-code (exit_code=0) hiddenparma to alert[1002,2002].wikimedia.org with reason: Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002
* 15:09 swfrench@cumin2002: START - Cookbook sre.deploy.python-code hiddenparma to alert[1002,2002].wikimedia.org with reason: Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002
* 15:09 swfrench@cumin2002: START - Cookbook sre.deploy.hiddenparma Hiddenparma deployment to the alerting hosts with reason: "Deploy: Fix an edge case in validation of a new object - swfrench@cumin2002"
* 15:05 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:59 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:58 sukhe: sudo cumin -b1 -s15 "C:bird" "run-puppet-agent --enable 'merging CR 1238007; add function return type'"
* 14:58 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:58 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:58 sukhe: sudo cumin -b1 -s15 "C:bird" "run-puppet-agent 'merging CR 1238007; add function return type'"
* 14:51 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:44 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 14:42 sukhe: sudo cumin "C:bird" "disable-puppet 'merging CR 1238007; add function return type'"
* 14:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.powercycle (exit_code=0) for host ml-serve1014
* 14:39 elukey@cumin1003: START - Cookbook sre.hosts.provision for host thanos-be1006.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:36 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host ml-serve1014
* 14:36 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.powercycle (exit_code=99) for host ml-serve1014
* 14:36 elukey@cumin1003: START - Cookbook sre.hosts.powercycle for host ml-serve1014
* 14:12 otto@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249367{{!}}stream: mediawiki.page_edit_type_simple.dev0 (T351225)]] (duration: 11m 05s)
* 14:08 otto@deploy2002: akhatun, otto: Continuing with sync
* 14:02 otto@deploy2002: akhatun, otto: Backport for [[gerrit:1249367{{!}}stream: mediawiki.page_edit_type_simple.dev0 (T351225)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:01 otto@deploy2002: Started scap sync-world: Backport for [[gerrit:1249367{{!}}stream: mediawiki.page_edit_type_simple.dev0 (T351225)]]
* 13:49 vriley@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
* 13:43 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:42 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich: apply
* 13:28 vgutierrez: testing acme-chief 0.39 in acmechief-test2001 - [[phab:T419352|T419352]]
* 13:27 vgutierrez: upload acme-chief 0.39 to bookworm-wikimedia (apt.wm.o) - [[phab:T419352|T419352]]
* 13:16 jiji@cumin1003: END (FAIL) - Cookbook sre.memcached.roll-reboot-restart (exit_code=1) rolling restart_daemons on A:memcached-canary
* 13:16 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling restart_daemons on A:memcached-canary
* 13:12 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249903{{!}}Require 2FA from 6 other user groups ($wgRestrictedGroups) (T418580)]], [[gerrit:1249035{{!}}kaiwiki: add logo, stiename, projectnamespace and timezone (T414237)]] (duration: 08m 45s)
* 13:08 mszwarc@deploy2002: mszwarc, anzx: Continuing with sync
* 13:05 mszwarc@deploy2002: mszwarc, anzx: Backport for [[gerrit:1249903{{!}}Require 2FA from 6 other user groups ($wgRestrictedGroups) (T418580)]], [[gerrit:1249035{{!}}kaiwiki: add logo, stiename, projectnamespace and timezone (T414237)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:03 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1249903{{!}}Require 2FA from 6 other user groups ($wgRestrictedGroups) (T418580)]], [[gerrit:1249035{{!}}kaiwiki: add logo, stiename, projectnamespace and timezone (T414237)]]
* 13:03 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling reboot on A:memcached-gutter-eqiad
* 12:57 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1015.eqiad.wmnet with OS bookworm
* 12:56 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling reboot on A:memcached-gutter-eqiad
* 12:51 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ml-serve1014.eqiad.wmnet with OS bookworm
* 12:50 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ml-serve1014
* 12:50 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ml-serve1014
* 12:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:49 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:49 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:49 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:48 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:48 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:47 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:45 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:44 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1014.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:42 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:42 jiji@cumin1003: END (PASS) - Cookbook sre.memcached.roll-reboot-restart (exit_code=0) rolling restart_daemons on A:memcached-canary
* 12:42 jiji@cumin1003: START - Cookbook sre.memcached.roll-reboot-restart rolling restart_daemons on A:memcached-canary
* 12:31 jclark@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:31 jclark@cumin1003: START - Cookbook sre.hosts.provision for host ml-serve1015.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART
* 12:24 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 12:10 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 12:10 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 11:59 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-be1095.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 11:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-fe2024.codfw.wmnet with OS bullseye
* 11:34 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1003"
* 11:17 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - ayounsi@cumin1003"
* 11:15 Emperor: rebalance codfw swift rings [[phab:T354872|T354872]]
* 10:53 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-fe2024.codfw.wmnet with reason: host reimage
* 10:47 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-fe2024.codfw.wmnet with reason: host reimage
* 10:31 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host ms-fe2024.codfw.wmnet with OS bullseye
* 10:30 ayounsi@cumin1003: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host ms-fe2024.codfw.wmnet with OS bullseye
* 10:20 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host ms-fe2024.codfw.wmnet with OS bullseye
* 10:17 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main'.
* 09:32 ayounsi@cumin1003: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device cr2-eqdfw
* 09:31 ayounsi@cumin1003: START - Cookbook sre.network.tls for network device cr2-eqdfw
* 09:22 derick@deploy2002: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=loginwiki --logwiki=metawiki TMPRI1975 FondueFanatic # [[phab:T419499|T419499]]
* 09:00 arnaudb@dns1005: END - running authdns-update
* 09:00 godog: restore all host interfaces - [[phab:T417393|T417393]]
* 08:58 arnaudb@dns1005: START - running authdns-update
* 08:30 godog: disabled interface for cloudcephmon1004 - [[phab:T417393|T417393]]
* 08:22 godog: disabled interfaces for cloudcephosd1021 cloudcephosd1042 cloudcephosd1043 cloudcephosd1018 cloudcephosd1022 - [[phab:T417393|T417393]]
* 08:18 godog: disabled interfaces for cloudcephosd1016 cloudcephosd1017 cloudcephosd1016 cloudcephosd1018 cloudcephosd1017 cloudcephosd1035 - [[phab:T417393|T417393]]
* 08:05 godog: start disabling cloudcephosd interfaces - [[phab:T417393|T417393]]
* 07:49 godog: prep cloudsw reboot tests 'ceph osd set noout' - [[phab:T417393|T417393]]
* 07:41 filippo@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 4:00:00 on 19 hosts with reason: switch down tests
* 06:14 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host wdqs2009.codfw.wmnet with OS bookworm
* 04:09 pt1979@cumin2002: END (FAIL) - Cookbook sre.network.tls (exit_code=99) for network device asw1-23-ulsfo
* 04:08 pt1979@cumin2002: START - Cookbook sre.network.tls for network device asw1-23-ulsfo
* 04:01 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.16 (duration: 01m 48s)
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 10s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:37 ryankemper: [WDQS] [[phab:T410573|T410573]] repooled wdqs1011.eqiad.wmnet - erroneously depooled since `2025-11-19` by failed `sre.wdqs.reboot` cookbook
* 00:42 vriley@cumin1003: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 00:39 vriley@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 00:29 vriley@cumin1003: START - Cookbook sre.hosts.provision for host contint1003.mgmt.eqiad.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
== 2026-03-09 ==
* 22:51 rzl: root@apt1002:~# reprepro --noskipold --restrict vopsbot update bookworm-wikimedia
* 22:34 bking@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM dse-k8s-ctrl1001.eqiad.wmnet
* 22:32 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1001.eqiad.wmnet
* 22:30 bking@cumin2002: END (PASS) - Cookbook sre.ganeti.reboot-vm (exit_code=0) for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:29 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:28 bking@cumin2002: END (FAIL) - Cookbook sre.ganeti.reboot-vm (exit_code=99) for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:28 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:28 bking@cumin2002: END (FAIL) - Cookbook sre.ganeti.reboot-vm (exit_code=99) for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:28 bking@cumin2002: START - Cookbook sre.ganeti.reboot-vm for VM dse-k8s-ctrl1002.eqiad.wmnet
* 22:03 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cloudgw2004-dev.codfw.wmnet with OS trixie
* 22:02 alexsanford: Redeployed security fix for [[phab:T419186|T419186]]
* 21:44 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
* 21:40 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
* 21:37 cdobbins@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7002.magru.wmnet
* 21:34 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7002.magru.wmnet with OS trixie
* 21:29 alexsanford: Deployed security fix for [[phab:T419186|T419186]]
* 21:22 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudgw2004-dev.codfw.wmnet with OS trixie
* 21:21 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudgw2004-dev.codfw.wmnet with OS trixie
* 21:17 dani@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249370{{!}}Pre-deploy participant recruitment survey on ptwiki and trwiki (T419275)]] (duration: 08m 15s)
* 21:13 dani@deploy2002: dani: Continuing with sync
* 21:11 dani@deploy2002: dani: Backport for [[gerrit:1249370{{!}}Pre-deploy participant recruitment survey on ptwiki and trwiki (T419275)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:09 dani@deploy2002: Started scap sync-world: Backport for [[gerrit:1249370{{!}}Pre-deploy participant recruitment survey on ptwiki and trwiki (T419275)]]
* 21:08 andrew@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
* 21:05 cdobbins@cumin2002: END (FAIL) - Cookbook sre.hosts.downtime (exit_code=99) for 2:00:00 on cp7002.magru.wmnet with reason: host reimage
* 21:02 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7002.magru.wmnet with reason: host reimage
* 21:01 andrew@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cloudgw2004-dev.codfw.wmnet with reason: host reimage
* 21:01 tgr_: removed private code for [[phab:T397244|T397244]]
* 21:01 ryankemper: [WDQS] Alright, these are re-entering a failed state soon enough that we will need to identify the offender if we want to restore proper service. We could put in place a temporary hack to restart them every few minutes so we at least maintain some uptime, but the root cause is the usual 'we need a requestctl rule to block whoever's killing us' scenario
* 21:00 cdobbins@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7001.magru.wmnet [reason: Trixie reimaging]
* 20:57 ryankemper: [WDQS] Auto-remediation would eventually have restarted these, but some of them were staying below our current threshold of `threads > 1200`. We may want to lower the threshold, or look at an additional metric type in the future
* 20:56 ryankemper: [WDQS] `ryankemper@cumin2002:~$ sudo -E cumin 'A:wdqs-main AND P<nowiki>{</nowiki>wdqs1*<nowiki>}</nowiki>' 'systemctl restart wdqs-blazegraph'`
* 20:54 ryankemper: [WDQS] `ryankemper@cumin2002:~$ sudo -E cumin 'A:wdqs-main AND P<nowiki>{</nowiki>wdqs2*<nowiki>}</nowiki>' 'systemctl restart wdqs-blazegraph'`
* 20:44 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudgw2004-dev.codfw.wmnet with OS trixie
* 20:43 tgr@deploy2002: Unlocked for deployment [MediaWiki]: working on private change (duration: 10m 10s)
* 20:36 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7002.magru.wmnet with OS trixie
* 20:33 tgr@deploy2002: Locking from deployment [MediaWiki]: working on private change
* 20:31 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247119{{!}}Enable parser survey for opted-out users on German/French/Polish wikis (T414852)]], [[gerrit:1249316{{!}}lift IP cap for womens month editathon (T419109)]] (duration: 13m 36s)
* 20:27 tgr@deploy2002: cscott, tgr, anzx: Continuing with sync
* 20:19 tgr@deploy2002: cscott, tgr, anzx: Backport for [[gerrit:1247119{{!}}Enable parser survey for opted-out users on German/French/Polish wikis (T414852)]], [[gerrit:1249316{{!}}lift IP cap for womens month editathon (T419109)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:17 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1247119{{!}}Enable parser survey for opted-out users on German/French/Polish wikis (T414852)]], [[gerrit:1249316{{!}}lift IP cap for womens month editathon (T419109)]]
* 20:13 aaron@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249363{{!}}Remove redundant math spec file from wwwportal (T418188)]] (duration: 06m 56s)
* 20:09 aaron@deploy2002: aaron: Continuing with sync
* 20:08 aaron@deploy2002: aaron: Backport for [[gerrit:1249363{{!}}Remove redundant math spec file from wwwportal (T418188)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:06 aaron@deploy2002: Started scap sync-world: Backport for [[gerrit:1249363{{!}}Remove redundant math spec file from wwwportal (T418188)]]
* 20:01 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp7016.*
* 19:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7001.magru.wmnet with OS trixie
* 19:51 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7016.magru.wmnet with OS trixie
* 19:49 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248911{{!}}Stop writing to il_to on commonswiki (T415787)]] (duration: 06m 04s)
* 19:45 zabe@deploy2002: zabe: Continuing with sync
* 19:44 zabe@deploy2002: zabe: Backport for [[gerrit:1248911{{!}}Stop writing to il_to on commonswiki (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:43 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248911{{!}}Stop writing to il_to on commonswiki (T415787)]]
* 19:29 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 19:28 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 19:28 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7001.magru.wmnet with reason: host reimage
* 19:24 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7016.magru.wmnet with reason: host reimage
* 19:23 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7001.magru.wmnet with reason: host reimage
* 19:19 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7016.magru.wmnet with reason: host reimage
* 19:15 cwhite@deploy2002: Finished deploy [performance/arc-lamp@aa8da8b]: {{Gerrit|Ie7e0355f89294a2927f9dbc28afec3a62d1752de}} (duration: 00m 08s)
* 19:15 cwhite@deploy2002: Started deploy [performance/arc-lamp@aa8da8b]: {{Gerrit|Ie7e0355f89294a2927f9dbc28afec3a62d1752de}}
* 19:14 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 19:14 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 19:05 herron@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249365{{!}}udp2log: switch to new hosts]] (duration: 09m 38s)
* 19:03 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:03 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:01 herron@deploy2002: herron: Continuing with sync
* 19:00 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:00 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 18:59 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 18:59 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 18:59 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 18:59 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 18:57 herron@deploy2002: herron: Backport for [[gerrit:1249365{{!}}udp2log: switch to new hosts]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:57 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7001.magru.wmnet with OS trixie
* 18:55 herron@deploy2002: Started scap sync-world: Backport for [[gerrit:1249365{{!}}udp2log: switch to new hosts]]
* 18:55 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp7016.magru.wmnet with OS trixie
* 18:50 btullis@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 18:49 btullis@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 18:44 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 18:44 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 18:34 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 18:34 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 18:33 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 18:33 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 18:32 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 18:32 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 18:29 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 18:29 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 18:27 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: sync
* 18:27 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: sync
* 18:23 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-debug: apply
* 18:23 dzahn@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host contint1003.wikimedia.org with OS trixie
* 18:23 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/mw-debug: apply
* 18:16 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-debug: apply
* 18:16 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-debug: apply
* 18:06 bking@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-ctrl1001.eqiad.wmnet
* 18:05 herron@deploy2002: Sync cancelled.
* 18:04 herron@deploy2002: herron: Backport for [[gerrit:1249361{{!}}Revert "udp2log: switch to new hosts"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 18:02 herron@deploy2002: Started scap sync-world: Backport for [[gerrit:1249361{{!}}Revert "udp2log: switch to new hosts"]]
* 18:01 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host dse-k8s-ctrl1001.eqiad.wmnet
* 17:54 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:47 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:42 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:42 herron@deploy2002: Sync cancelled.
* 17:40 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:39 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:38 mutante: contint1003 - unable to get uptime Caused by: Cumin execution failed (exit_code=2) [101/240] - attempted manual powercycle - Initializing Firmware Interfaces... blank screen [[phab:T418544|T418544]]
* 17:34 mutante: contint1003.mgmt - racadm serveraction powercycle [[phab:T418544|T418544]] - not reacting
* 17:25 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:25 herron@deploy2002: herron: Backport for [[gerrit:1249332{{!}}udp2log: switch to new hosts (T417002)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:23 herron@deploy2002: Started scap sync-world: Backport for [[gerrit:1249332{{!}}udp2log: switch to new hosts (T417002)]]
* 17:19 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:15 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.makevm (exit_code=0) for new host netflow4003.ulsfo.wmnet
* 17:15 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host netflow4003.ulsfo.wmnet with OS bookworm
* 17:13 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:08 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 17:07 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-test2001.codfw.wmnet
* 17:03 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint1003.wikimedia.org with OS trixie
* 17:02 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-test2001.codfw.wmnet
* 17:00 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Managing sanitization for wikis kaiwiki in section s5
* 16:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on netflow4003.ulsfo.wmnet with reason: host reimage
* 16:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on netflow4003.ulsfo.wmnet with reason: host reimage
* 16:37 moritzm: installing gnupg security updates
* 16:31 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host netflow4003.ulsfo.wmnet with OS bookworm
* 16:31 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
* 16:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
* 16:30 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) netflow4003.ulsfo.wmnet on all recursors
* 16:30 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache netflow4003.ulsfo.wmnet on all recursors
* 16:30 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 16:30 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
* 16:30 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM netflow4003.ulsfo.wmnet - jmm@cumin2002"
* 16:26 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 16:26 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host netflow4003.ulsfo.wmnet
* 16:26 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 16:16 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 15:58 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on prometheus4003.ulsfo.wmnet with reason: host reimage
* 15:53 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on prometheus4003.ulsfo.wmnet with reason: host reimage
* 15:44 vgutierrez: vgutierrez@acmechief-test2001:~$ sudo -i systemctl disable reload-acme-chief-backend.timer - [[phab:T419352|T419352]]
* 15:40 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 15:37 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 15:36 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 15:30 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 15:26 elukey@cumin1003: END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 15:24 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 15:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2004.codfw.wmnet
* 15:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people2004.codfw.wmnet
* 15:12 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 15:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo1001.eqiad.wmnet
* 15:08 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 15:03 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo1001.eqiad.wmnet
* 14:50 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2009.codfw.wmnet with OS bookworm
* 14:49 bking@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host wdqs2009.codfw.wmnet with OS bullseye
* 14:45 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 14:35 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249291{{!}}Hide 2fa-warning Echo category from preferences (T419111)]] (duration: 06m 07s)
* 14:35 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis kaiwiki in section s5
* 14:34 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.sanitize-wiki (exit_code=99) Managing sanitization for wikis urwikisource in section s5
* 14:31 mszwarc@deploy2002: mszwarc: Continuing with sync
* 14:31 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1249291{{!}}Hide 2fa-warning Echo category from preferences (T419111)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:30 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Managing sanitization for wikis urwikisource in section s5
* 14:29 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1249291{{!}}Hide 2fa-warning Echo category from preferences (T419111)]]
* 14:25 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.sanitize-wiki (exit_code=0) Checking sanitization for wikis urwikisource in section s5
* 14:22 fceratto@cumin1003: START - Cookbook sre.mysql.sanitize-wiki Checking sanitization for wikis urwikisource in section s5
* 14:20 bking@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 14:15 bking@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 14:15 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249243{{!}}JS SDK: Add getExperimentByPrefix() (T419191)]], [[gerrit:1249242{{!}}ext.wikimediaEvents: pageVisit -> loggedOutReaderRetention (T419191)]] (duration: 09m 39s)
* 14:11 phuedx@deploy2002: phuedx: Continuing with sync
* 14:07 phuedx@deploy2002: phuedx: Backport for [[gerrit:1249243{{!}}JS SDK: Add getExperimentByPrefix() (T419191)]], [[gerrit:1249242{{!}}ext.wikimediaEvents: pageVisit -> loggedOutReaderRetention (T419191)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:05 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1249243{{!}}JS SDK: Add getExperimentByPrefix() (T419191)]], [[gerrit:1249242{{!}}ext.wikimediaEvents: pageVisit -> loggedOutReaderRetention (T419191)]]
* 14:03 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 13:55 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet
* 13:54 bking@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2009.codfw.wmnet with OS bullseye
* 13:50 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249262{{!}}Disable MetricsPlatform extension (T416865)]] (duration: 08m 02s)
* 13:49 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet
* 13:46 phuedx@deploy2002: phuedx, sfaci: Continuing with sync
* 13:44 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 13:44 phuedx@deploy2002: phuedx, sfaci: Backport for [[gerrit:1249262{{!}}Disable MetricsPlatform extension (T416865)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:42 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1249262{{!}}Disable MetricsPlatform extension (T416865)]]
* 13:39 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248075{{!}}Confirmemail: Log delay between email sent and confirmation (T415902)]], [[gerrit:1247651{{!}}Enable confirmemail logstash channel (T415902)]] (duration: 11m 16s)
* 13:35 phuedx@deploy2002: mmartorana, phuedx: Continuing with sync
* 13:30 phuedx@deploy2002: mmartorana, phuedx: Backport for [[gerrit:1248075{{!}}Confirmemail: Log delay between email sent and confirmation (T415902)]], [[gerrit:1247651{{!}}Enable confirmemail logstash channel (T415902)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 13:28 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1248075{{!}}Confirmemail: Log delay between email sent and confirmation (T415902)]], [[gerrit:1247651{{!}}Enable confirmemail logstash channel (T415902)]]
* 13:10 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 13:04 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 12:55 moritzm: installing Kerberos security updates
* 12:29 moritzm: installing python3.9 security updates
* 12:11 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 12:00 reedy@deploy2002: Finished scap sync-world: Backport for [[gerrit:1239026{{!}}Revert "CommonSettings: Temporarily set $wgOATHUserHandlesTable = true" (T416544)]], [[gerrit:1249253{{!}}CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled]] (duration: 06m 13s)
* 11:56 reedy@deploy2002: reedy: Continuing with sync
* 11:56 reedy@deploy2002: reedy: Backport for [[gerrit:1239026{{!}}Revert "CommonSettings: Temporarily set $wgOATHUserHandlesTable = true" (T416544)]], [[gerrit:1249253{{!}}CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:54 reedy@deploy2002: Started scap sync-world: Backport for [[gerrit:1239026{{!}}Revert "CommonSettings: Temporarily set $wgOATHUserHandlesTable = true" (T416544)]], [[gerrit:1249253{{!}}CommonSettings: Remove orphaned $wgWebAuthnNewCredsDisabled]]
* 11:44 phuedx@deploy2002: Finished scap sync-world: Backport for [[gerrit:1249245{{!}}Hooks: Really only add global logging context for pageviews]] (duration: 12m 02s)
* 11:38 phuedx@deploy2002: phuedx: Continuing with sync
* 11:34 phuedx@deploy2002: phuedx: Backport for [[gerrit:1249245{{!}}Hooks: Really only add global logging context for pageviews]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:32 phuedx@deploy2002: Started scap sync-world: Backport for [[gerrit:1249245{{!}}Hooks: Really only add global logging context for pageviews]]
* 11:29 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 11:29 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 11:03 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 11:03 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:57 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 10:56 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 10:50 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:49 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:45 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus4003.ulsfo.wmnet
* 10:45 jmm@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host prometheus4003.ulsfo.wmnet with OS bookworm
* 10:45 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host prometheus4003.ulsfo.wmnet with OS bookworm
* 10:44 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
* 10:44 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.ganeti.makevm: created new VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) prometheus4003.ulsfo.wmnet on all recursors
* 10:43 jmm@cumin2002: START - Cookbook sre.dns.wipe-cache prometheus4003.ulsfo.wmnet on all recursors
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:43 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
* 10:43 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add records for VM prometheus4003.ulsfo.wmnet - jmm@cumin2002"
* 10:40 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:39 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/aux-k8s-services/kafka-mirrormaker: apply
* 10:39 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:39 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host prometheus4003.ulsfo.wmnet
* 10:35 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:33 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 10:17 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 10:12 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:51 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 09:46 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 09:43 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:40 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.makevm (exit_code=99) for new host prometheus4003.ulsfo.wmnet
* 09:40 jmm@cumin2002: END (FAIL) - Cookbook sre.dns.netbox (exit_code=99)
* 09:35 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 09:35 jmm@cumin2002: START - Cookbook sre.ganeti.makevm for new host prometheus4003.ulsfo.wmnet
* 09:31 vriley@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host frdb1008
* 09:31 vriley@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host frdb1008
* 09:29 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 09:05 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 08:37 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wmde: apply
* 08:36 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wmde: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-wikidata: apply
* 08:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 08:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-test-k8s: apply
* 08:31 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-sre: apply
* 08:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-sre: apply
* 08:30 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-search: apply
* 08:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-search: apply
* 08:29 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-research: apply
* 08:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-research: apply
* 08:28 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 08:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-platform-eng: apply
* 08:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 08:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.addnode (exit_code=0) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
* 08:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 08:25 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-ml: apply
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 08:25 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-ml: apply
* 08:24 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 08:22 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-product: apply
* 08:21 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main'.
* 08:16 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
* 08:16 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo02 and group 1
* 08:07 jmm@cumin2002: END (FAIL) - Cookbook sre.ganeti.addnode (exit_code=99) for new host ganeti4006.ulsfo.wmnet to cluster ulsfo and group 1
* 08:07 jmm@cumin2002: START - Cookbook sre.ganeti.addnode for new host ganeti4006.ulsfo.wmnet to cluster ulsfo and group 1
* 07:37 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248806{{!}}Add a script to send mandatory 2FA Echo notification (T419111)]], [[gerrit:1248821{{!}}Set $wgOATH2FARequiredGroupRemovalPages for interface-admins (T417880)]] (duration: 34m 41s)
* 07:23 mszwarc@deploy2002: mszwarc: Continuing with sync
* 07:22 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1248806{{!}}Add a script to send mandatory 2FA Echo notification (T419111)]], [[gerrit:1248821{{!}}Set $wgOATH2FARequiredGroupRemovalPages for interface-admins (T417880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 07:02 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1248806{{!}}Add a script to send mandatory 2FA Echo notification (T419111)]], [[gerrit:1248821{{!}}Set $wgOATH2FARequiredGroupRemovalPages for interface-admins (T417880)]]
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 58s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-08 ==
* 20:28 vgutierrez@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on acmechief-test2001.codfw.wmnet with reason: GTS issues
* 02:01 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 00m 59s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== 2026-03-07 ==
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 23s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:20 krinkle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248952{{!}}CSP: restore toolforge/wmcs entry in false positive list]] (duration: 10m 46s)
* 01:16 krinkle@deploy2002: krinkle: Continuing with sync
* 01:11 krinkle@deploy2002: krinkle: Backport for [[gerrit:1248952{{!}}CSP: restore toolforge/wmcs entry in false positive list]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:09 krinkle@deploy2002: Started scap sync-world: Backport for [[gerrit:1248952{{!}}CSP: restore toolforge/wmcs entry in false positive list]]
* 00:22 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2043.*
* 00:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cp2043.codfw.wmnet
* 00:05 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp2043.codfw.wmnet
== 2026-03-06 ==
* 23:29 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wdqs2009.codfw.wmnet with OS bullseye
* 23:13 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 23:07 ryankemper@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on wdqs2009.codfw.wmnet with reason: host reimage
* 22:46 ryankemper@cumin2002: END (PASS) - Cookbook sre.hosts.move-vlan (exit_code=0) for host wdqs2009
* 22:46 ryankemper@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host wdqs2009
* 22:46 ryankemper@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host wdqs2009
* 22:46 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.wipe-cache (exit_code=0) wdqs2009.codfw.wmnet 141.0.192.10.in-addr.arpa 1.4.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:46 ryankemper@cumin2002: START - Cookbook sre.dns.wipe-cache wdqs2009.codfw.wmnet 141.0.192.10.in-addr.arpa 1.4.1.0.0.0.0.0.2.9.1.0.0.1.0.0.1.0.1.0.0.6.8.0.0.0.0.0.0.2.6.2.ip6.arpa on all recursors
* 22:45 ryankemper@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 22:45 ryankemper@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs2009 - ryankemper@cumin2002"
* 22:45 ryankemper@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Update records for host wdqs2009 - ryankemper@cumin2002"
* 22:41 ryankemper@cumin2002: START - Cookbook sre.dns.netbox
* 22:40 ryankemper@cumin2002: START - Cookbook sre.hosts.move-vlan for host wdqs2009
* 22:39 ryankemper@cumin2002: START - Cookbook sre.hosts.reimage for host wdqs2009.codfw.wmnet with OS bullseye
* 19:48 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:47 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:47 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:46 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 19:36 bking@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host wdqs2009.codfw.wmnet
* 19:23 bking@cumin2002: START - Cookbook sre.hosts.reboot-single for host wdqs2009.codfw.wmnet
* 19:17 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on wdqs2009.codfw.wmnet with reason: NFS might be hung, about to reboot
* 18:56 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp2043.codfw.wmnet with reason: troubleshooting for network drops
* 18:44 brett@puppetserver1001: conftool action : set/pooled=no; selector: name=cp2043.*
* 18:29 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.decommission (exit_code=1) for hosts an-backup-datanode1033.eqiad.wmnet
* 18:29 btullis@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:29 btullis@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-backup-datanode1033.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1003"
* 18:28 btullis@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: an-backup-datanode1033.eqiad.wmnet decommissioned, removing all IPs except the asset tag one - btullis@cumin1003"
* 17:59 ebernhardson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248858{{!}}cirrus: Use https for semanticsearch-test cluster]] (duration: 11m 20s)
* 17:53 ebernhardson@deploy2002: ebernhardson: Continuing with sync
* 17:52 ebernhardson@deploy2002: ebernhardson: Backport for [[gerrit:1248858{{!}}cirrus: Use https for semanticsearch-test cluster]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:51 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:51 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:47 ebernhardson@deploy2002: Started scap sync-world: Backport for [[gerrit:1248858{{!}}cirrus: Use https for semanticsearch-test cluster]]
* 17:42 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:42 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:40 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:40 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:11 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:11 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 17:10 btullis@cumin1003: START - Cookbook sre.dns.netbox
* 17:05 hashar@deploy2002: Finished deploy [gerrit/gerrit@b8183ba]: wm-checks-api: add tooltip to the CheckRun Run action (duration: 00m 13s)
* 17:05 hashar@deploy2002: Started deploy [gerrit/gerrit@b8183ba]: wm-checks-api: add tooltip to the CheckRun Run action
* 17:04 btullis@cumin1003: START - Cookbook sre.hosts.decommission for hosts an-backup-datanode1033.eqiad.wmnet
* 16:48 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 16:48 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 16:23 btullis@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 16:23 btullis@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 15:57 brouberol@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 15:57 brouberol@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 15:56 brouberol@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 15:56 brouberol@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 15:52 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2354-2356].codfw.wmnet
* 15:52 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2354-2356].codfw.wmnet
* 15:51 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2356.codfw.wmnet with OS trixie
* 15:46 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2355.codfw.wmnet with OS trixie
* 15:42 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2354.codfw.wmnet with OS trixie
* 15:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2356.codfw.wmnet with reason: host reimage
* 15:31 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-main: apply
* 15:30 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-main: apply
* 15:28 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main'.
* 15:28 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-main: apply
* 15:28 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-main: apply
* 15:26 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-main: apply
* 15:26 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-main: apply
* 15:26 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2355.codfw.wmnet with reason: host reimage
* 15:24 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics-external: apply
* 15:23 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics-external: apply
* 15:23 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2354.codfw.wmnet with reason: host reimage
* 15:19 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics-external: apply
* 15:19 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics-external: apply
* 15:17 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics-external: apply
* 15:17 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics-external: apply
* 15:17 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2356.codfw.wmnet with reason: host reimage
* 15:16 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2355.codfw.wmnet with reason: host reimage
* 15:16 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2354.codfw.wmnet with reason: host reimage
* 15:15 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main'.
* 15:10 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 15:09 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 15:08 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-analytics: apply
* 15:08 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-analytics: apply
* 15:06 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 15:05 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 15:05 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-analytics: apply
* 15:05 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-analytics: apply
* 15:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2356.codfw.wmnet with OS trixie
* 15:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2355.codfw.wmnet with OS trixie
* 15:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2354.codfw.wmnet with OS trixie
* 15:02 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-analytics: apply
* 15:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2348-2353].codfw.wmnet
* 15:02 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-analytics: apply
* 15:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2348-2353].codfw.wmnet
* 14:59 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2353.codfw.wmnet with OS trixie
* 14:57 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2349.codfw.wmnet with OS trixie
* 14:57 tchin@deploy2002: helmfile [eqiad] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:56 tchin@deploy2002: helmfile [eqiad] START helmfile.d/services/eventgate-logging-external: apply
* 14:53 tchin@deploy2002: helmfile [codfw] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:52 tchin@deploy2002: helmfile [codfw] START helmfile.d/services/eventgate-logging-external: apply
* 14:52 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2351.codfw.wmnet with OS trixie
* 14:49 klausman@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 14:49 klausman@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 14:48 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2352.codfw.wmnet with OS trixie
* 14:48 klausman@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 14:48 klausman@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 14:48 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:47 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 14:45 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2350.codfw.wmnet with OS trixie
* 14:44 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 14:43 tchin@deploy2002: helmfile [staging] DONE helmfile.d/services/eventgate-logging-external: apply
* 14:43 tchin@deploy2002: helmfile [staging] START helmfile.d/services/eventgate-logging-external: apply
* 14:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2348.codfw.wmnet with OS trixie
* 14:41 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2353.codfw.wmnet with reason: host reimage
* 14:37 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2349.codfw.wmnet with reason: host reimage
* 14:33 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2351.codfw.wmnet with reason: host reimage
* 14:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2352.codfw.wmnet with reason: host reimage
* 14:29 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:28 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:26 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2350.codfw.wmnet with reason: host reimage
* 14:23 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2348.codfw.wmnet with reason: host reimage
* 14:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2351.codfw.wmnet with reason: host reimage
* 14:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2352.codfw.wmnet with reason: host reimage
* 14:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2353.codfw.wmnet with reason: host reimage
* 14:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2350.codfw.wmnet with reason: host reimage
* 14:19 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2349.codfw.wmnet with reason: host reimage
* 14:19 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2348.codfw.wmnet with reason: host reimage
* 14:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2353.codfw.wmnet with OS trixie
* 14:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2352.codfw.wmnet with OS trixie
* 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2351.codfw.wmnet with OS trixie
* 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2350.codfw.wmnet with OS trixie
* 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2349.codfw.wmnet with OS trixie
* 14:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2348.codfw.wmnet with OS trixie
* 14:03 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2349.codfw.wmnet with OS trixie
* 14:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2349.codfw.wmnet with OS trixie
* 14:03 blake@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=93) for host wikikube-worker2348.codfw.wmnet with OS trixie
* 14:03 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2348.codfw.wmnet with OS trixie
* 14:02 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2342-2347].codfw.wmnet
* 14:02 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2342-2347].codfw.wmnet
* 14:01 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2347.codfw.wmnet with OS trixie
* 13:57 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2346.codfw.wmnet with OS trixie
* 13:55 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2343.codfw.wmnet with OS trixie
* 13:50 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2345.codfw.wmnet with OS trixie
* 13:48 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2344.codfw.wmnet with OS trixie
* 13:45 dreamyjazz@deploy2002: mwscript-k8s job started: foreachwikiindblist checkuser-suggested-investigations CheckUser:queueAutoCloseSICases.php # [[phab:T418591|T418591]]
* 13:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2342.codfw.wmnet with OS trixie
* 13:42 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2347.codfw.wmnet with reason: host reimage
* 13:38 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2346.codfw.wmnet with reason: host reimage
* 13:35 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2343.codfw.wmnet with reason: host reimage
* 13:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2345.codfw.wmnet with reason: host reimage
* 13:28 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2344.codfw.wmnet with reason: host reimage
* 13:24 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2342.codfw.wmnet with reason: host reimage
* 13:21 Dreamy_Jazz: Running foreachwikiindblist checkuser-suggested-investigations.dblist ~/PopulateSiuInfo.php --batch-size=1000 for [[phab:T411118|T411118]]
* 13:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2347.codfw.wmnet with reason: host reimage
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2346.codfw.wmnet with reason: host reimage
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2345.codfw.wmnet with reason: host reimage
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2344.codfw.wmnet with reason: host reimage
* 13:20 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2343.codfw.wmnet with reason: host reimage
* 13:19 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2342.codfw.wmnet with reason: host reimage
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2347.codfw.wmnet with OS trixie
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2346.codfw.wmnet with OS trixie
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2345.codfw.wmnet with OS trixie
* 13:07 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2344.codfw.wmnet with OS trixie
* 13:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2343.codfw.wmnet with OS trixie
* 13:06 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2342.codfw.wmnet with OS trixie
* 13:05 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2336-2341].codfw.wmnet
* 13:05 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2336-2341].codfw.wmnet
* 13:01 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2341.codfw.wmnet with OS trixie
* 12:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2340.codfw.wmnet with OS trixie
* 12:49 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2337.codfw.wmnet with OS trixie
* 12:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2338.codfw.wmnet with OS trixie
* 12:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2336.codfw.wmnet with OS trixie
* 12:40 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2341.codfw.wmnet with reason: host reimage
* 12:35 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2339.codfw.wmnet with OS trixie
* 12:34 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2340.codfw.wmnet with reason: host reimage
* 12:31 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2337.codfw.wmnet with reason: host reimage
* 12:26 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2338.codfw.wmnet with reason: host reimage
* 12:22 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2336.codfw.wmnet with reason: host reimage
* 12:18 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2339.codfw.wmnet with reason: host reimage
* 12:13 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2340.codfw.wmnet with reason: host reimage
* 12:13 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2341.codfw.wmnet with reason: host reimage
* 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2337.codfw.wmnet with reason: host reimage
* 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2338.codfw.wmnet with reason: host reimage
* 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2336.codfw.wmnet with reason: host reimage
* 12:12 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2339.codfw.wmnet with reason: host reimage
* 12:00 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2341.codfw.wmnet with OS trixie
* 12:00 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2340.codfw.wmnet with OS trixie
* 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2339.codfw.wmnet with OS trixie
* 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2338.codfw.wmnet with OS trixie
* 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2337.codfw.wmnet with OS trixie
* 11:59 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2336.codfw.wmnet with OS trixie
* 11:56 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2333-2335].codfw.wmnet
* 11:55 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2333-2335].codfw.wmnet
* 11:55 elukey@cumin1003: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 11:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1207.eqiad.wmnet
* 11:54 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2335.codfw.wmnet with OS trixie
* 11:53 moritzm: uploaded icu 72.1-3+deb12u1~wmf11u1 to component/php83-icu72 [[phab:T419058|T419058]] (backport of ICU 72 from Bookworm to Bullseye, built to be co-installable with the native ICU from Bullseye)
* 11:50 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2334.codfw.wmnet with OS trixie
* 11:49 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1207.eqiad.wmnet
* 11:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1205.eqiad.wmnet
* 11:45 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2333.codfw.wmnet with OS trixie
* 11:39 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 11:39 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1205.eqiad.wmnet
* 11:34 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2335.codfw.wmnet with reason: host reimage
* 11:30 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2334.codfw.wmnet with reason: host reimage
* 11:27 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2333.codfw.wmnet with reason: host reimage
* 11:23 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2335.codfw.wmnet with reason: host reimage
* 11:22 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2334.codfw.wmnet with reason: host reimage
* 11:21 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2333.codfw.wmnet with reason: host reimage
* 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2334.codfw.wmnet with OS trixie
* 11:09 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2335.codfw.wmnet with OS trixie
* 11:08 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2333.codfw.wmnet with OS trixie
* 11:06 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2332.codfw.wmnet
* 11:05 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2332.codfw.wmnet
* 11:02 blake@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker2332.codfw.wmnet with OS trixie
* 10:43 blake@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker2332.codfw.wmnet with reason: host reimage
* 10:36 blake@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker2332.codfw.wmnet with reason: host reimage
* 10:23 blake@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker2332.codfw.wmnet with OS trixie
* 10:23 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1199.eqiad.wmnet
* 10:21 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1194.eqiad.wmnet
* 10:16 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker[2332-2356].codfw.wmnet
* 10:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1194.eqiad.wmnet
* 10:09 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1199.eqiad.wmnet
* 10:09 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 10:03 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker[2332-2356].codfw.wmnet
* 09:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 09:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 09:39 Emperor: repool ms-fe1013 after PXE work [[phab:T401966|T401966]]
* 09:23 derick@deploy2002: mwscript-k8s job started: extensions/CentralAuth/maintenance/fixStuckGlobalRename.php --wiki=pmswiki --logwiki=metawiki Wikilimes Limes.pink # [[phab:T419184|T419184]]
* 09:10 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:09 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:08 elukey@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-fe1013.eqiad.wmnet
* 09:08 elukey@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-fe1013.eqiad.wmnet
* 08:57 elukey@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-fe1013.eqiad.wmnet
* 08:56 elukey@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-fe1013.eqiad.wmnet
* 08:54 elukey@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ms-fe1013.eqiad.wmnet
* 08:42 elukey@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ms-fe1013.eqiad.wmnet
* 08:25 moritzm: uploaded openjdk-8 8u482-ga-1~deb12u1 to component/jdk8 of bookworm-wikimedia
* 08:11 moritzm: imported prometheus-ganeti-exporter 0.3+deb12u2 for bookworm-wikimedia [[phab:T419166|T419166]]
* 06:23 ryankemper@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikidata-query-gui: apply
* 06:23 ryankemper@deploy2002: helmfile [codfw] START helmfile.d/services/wikidata-query-gui: apply
* 06:23 ryankemper@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikidata-query-gui: apply
* 06:23 ryankemper@deploy2002: helmfile [eqiad] START helmfile.d/services/wikidata-query-gui: apply
* 06:22 ryankemper@deploy2002: helmfile [staging] DONE helmfile.d/services/wikidata-query-gui: apply
* 06:22 ryankemper@deploy2002: helmfile [staging] START helmfile.d/services/wikidata-query-gui: apply
* 02:59 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 02:59 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
* 02:59 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
* 02:56 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 02:21 zabe: zabe@deploy2002:/srv/mediawiki-staging$ foreachwiki extensions/TimedMediaHandler/maintenance/migrateTranscodeStates.php --force # [[phab:T415064|T415064]]
* 02:16 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248658{{!}}Update interwiki cache]] (duration: 06m 38s)
* 02:12 zabe@deploy2002: mwscript-k8s job started: foreachwikiindblist wikidataclient extensions/Wikibase/lib/maintenance/populateSitesTable.php --force-protocol https # [[phab:T415978|T415978]], [[phab:T414241|T414241]]
* 02:12 zabe@deploy2002: zabe: Continuing with sync
* 02:11 zabe@deploy2002: zabe: Backport for [[gerrit:1248658{{!}}Update interwiki cache]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 02:09 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248658{{!}}Update interwiki cache]]
* 02:09 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 23s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:59 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248656{{!}}Set urwikisource to rtl (T415960)]] (duration: 06m 39s)
* 01:55 zabe@deploy2002: zabe: Continuing with sync
* 01:54 zabe@deploy2002: zabe: Backport for [[gerrit:1248656{{!}}Set urwikisource to rtl (T415960)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:53 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248656{{!}}Set urwikisource to rtl (T415960)]]
* 01:45 zabe@deploy2002: Sync cancelled.
* 01:43 zabe@deploy2002: zabe: Backport for [[gerrit:1248653{{!}}Activate urwikisource (T415960)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:42 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248653{{!}}Activate urwikisource (T415960)]]
* 01:38 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248652{{!}}Prepare urwikisource (T415960)]] (duration: 06m 18s)
* 01:34 zabe@deploy2002: zabe: Continuing with sync
* 01:34 zabe@deploy2002: zabe: Backport for [[gerrit:1248652{{!}}Prepare urwikisource (T415960)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:32 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248652{{!}}Prepare urwikisource (T415960)]]
* 01:29 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248650{{!}}Activate kaiwiki (T414234)]] (duration: 06m 57s)
* 01:25 zabe@deploy2002: zabe: Continuing with sync
* 01:24 zabe@deploy2002: zabe: Backport for [[gerrit:1248650{{!}}Activate kaiwiki (T414234)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:22 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248650{{!}}Activate kaiwiki (T414234)]]
* 01:17 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248647{{!}}Prepare kaiwiki (T414234)]] (duration: 07m 25s)
* 01:13 zabe@deploy2002: zabe: Continuing with sync
* 01:11 zabe@deploy2002: zabe: Backport for [[gerrit:1248647{{!}}Prepare kaiwiki (T414234)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:09 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248647{{!}}Prepare kaiwiki (T414234)]]
* 00:33 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248493{{!}}Stop writing to il_to on all wikis except commons (T415787)]] (duration: 06m 22s)
* 00:29 zabe@deploy2002: zabe: Continuing with sync
* 00:28 zabe@deploy2002: zabe: Backport for [[gerrit:1248493{{!}}Stop writing to il_to on all wikis except commons (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:27 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248493{{!}}Stop writing to il_to on all wikis except commons (T415787)]]
* 00:05 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248628{{!}}Re-enable AllowUserJs (T419137)]] (duration: 08m 08s)
* 00:01 catrope@deploy2002: catrope, kharlan: Continuing with sync
== 2026-03-05 ==
* 23:58 catrope@deploy2002: catrope, kharlan: Backport for [[gerrit:1248628{{!}}Re-enable AllowUserJs (T419137)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:56 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1248628{{!}}Re-enable AllowUserJs (T419137)]]
* 23:52 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248636{{!}}CSP: Update false positives list]] (duration: 06m 34s)
* 23:52 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint2003.wikimedia.org with OS trixie
* 23:47 catrope@deploy2002: catrope: Continuing with sync
* 23:47 catrope@deploy2002: catrope: Backport for [[gerrit:1248636{{!}}CSP: Update false positives list]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:45 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1248636{{!}}CSP: Update false positives list]]
* 23:33 dzahn@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on contint2003.wikimedia.org with reason: host reimage
* 23:29 dzahn@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on contint2003.wikimedia.org with reason: host reimage
* 23:15 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1238028{{!}}Using Hadoop for MostTranscludedPages on commonswiki (T416927)]] (duration: 06m 27s)
* 23:11 zabe@deploy2002: zabe: Continuing with sync
* 23:10 zabe@deploy2002: zabe: Backport for [[gerrit:1238028{{!}}Using Hadoop for MostTranscludedPages on commonswiki (T416927)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:09 dzahn@cumin2002: START - Cookbook sre.hosts.reimage for host contint2003.wikimedia.org with OS trixie
* 23:08 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1238028{{!}}Using Hadoop for MostTranscludedPages on commonswiki (T416927)]]
* 22:45 maryum: Deployed security fix for [[phab:T418254|T418254]]
* 22:35 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248483{{!}}SpecialWantedFiles: Use lt_title instead of lt_to (T299953)]] (duration: 06m 12s)
* 22:31 zabe@deploy2002: zabe: Continuing with sync
* 22:30 zabe@deploy2002: zabe: Backport for [[gerrit:1248483{{!}}SpecialWantedFiles: Use lt_title instead of lt_to (T299953)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:28 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248483{{!}}SpecialWantedFiles: Use lt_title instead of lt_to (T299953)]]
* 21:43 ebernhardson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248508{{!}}cirrus: Align semanticsearch cluster group name with routing (T413969)]] (duration: 07m 20s)
* 21:39 ebernhardson@deploy2002: ebernhardson: Continuing with sync
* 21:38 ebernhardson@deploy2002: ebernhardson: Backport for [[gerrit:1248508{{!}}cirrus: Align semanticsearch cluster group name with routing (T413969)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:36 ebernhardson@deploy2002: Started scap sync-world: Backport for [[gerrit:1248508{{!}}cirrus: Align semanticsearch cluster group name with routing (T413969)]]
* 21:04 jhathaway@dns1004: END - running authdns-update
* 21:02 jhathaway@dns1004: START - running authdns-update
* 20:53 jasmine@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:52 jasmine@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new service IPs for sophroid - jasmine@cumin2002"
* 20:52 jasmine@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add new service IPs for sophroid - jasmine@cumin2002"
* 20:47 jasmine@cumin2002: START - Cookbook sre.dns.netbox
* 20:28 cdanis: apt built and imported jwt-authorizer 1.3.0-1
* 20:16 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group2 to 1.46.0-wmf.18 refs [[phab:T413809|T413809]]
* 20:04 krinkle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248574{{!}}Allow toolforge APIs in enforced CSP mode (T135963 T419137 T220475)]] (duration: 07m 37s)
* 20:00 krinkle@deploy2002: krinkle: Continuing with sync
* 19:58 krinkle@deploy2002: krinkle: Backport for [[gerrit:1248574{{!}}Allow toolforge APIs in enforced CSP mode (T135963 T419137 T220475)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:56 krinkle@deploy2002: Started scap sync-world: Backport for [[gerrit:1248574{{!}}Allow toolforge APIs in enforced CSP mode (T135963 T419137 T220475)]]
* 19:21 sbassett@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248571{{!}}Re-enable Site JS (T419137 T419138)]] (duration: 06m 57s)
* 19:17 sbassett@deploy2002: sbassett: Continuing with sync
* 19:16 sbassett@deploy2002: sbassett: Backport for [[gerrit:1248571{{!}}Re-enable Site JS (T419137 T419138)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:15 sbassett@deploy2002: Started scap sync-world: Backport for [[gerrit:1248571{{!}}Re-enable Site JS (T419137 T419138)]]
* 19:04 dr0ptp4kt: Deploying change {{Gerrit|1239200}} for refinery ([[phab:T416481|T416481]]) using scap, then deployed onto hdfs
* 19:03 dr0ptp4kt: Deployed refinery change {{Gerrit|1240253}} ([[phab:T414478|T414478]]), {{Gerrit|1240253}} (no-op) for refinery ([[phab:T414478|T414478]]) using scap, then deployed onto hdfs
* 18:58 dr0ptp4kt@deploy2002: Finished deploy [analytics/refinery@dd641b1] (thin): Regular analytics weekly train THIN [analytics/refinery@dd641b15] (duration: 02m 02s)
* 18:56 dr0ptp4kt@deploy2002: Started deploy [analytics/refinery@dd641b1] (thin): Regular analytics weekly train THIN [analytics/refinery@dd641b15]
* 18:55 dr0ptp4kt@deploy2002: Finished deploy [analytics/refinery@dd641b1]: Regular analytics weekly train [analytics/refinery@dd641b15] (duration: 04m 18s)
* 18:50 dr0ptp4kt@deploy2002: Started deploy [analytics/refinery@dd641b1]: Regular analytics weekly train [analytics/refinery@dd641b15]
* 18:49 dr0ptp4kt@deploy2002: Finished deploy [analytics/refinery@dd641b1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@dd641b15] (duration: 01m 57s)
* 18:47 dr0ptp4kt: Deploying change {{Gerrit|1239200}} for refinery ([[phab:T416481|T416481]])
* 18:47 dr0ptp4kt@deploy2002: Started deploy [analytics/refinery@dd641b1] (hadoop-test): Regular analytics weekly train TEST [analytics/refinery@dd641b15]
* 18:31 eevans@dns1004: END - running authdns-update
* 18:30 eevans@dns1004: START - running authdns-update
* 18:30 sukhe: sudo cumin -b51 "A:cp" "run-puppet-agent --enable 'rolling out 1248544'"
* 18:16 sukhe: sudo cumin "A:cp" "disable-puppet 'rolling out 1248544'"
* 18:06 pt1979@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 18:06 pt1979@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
* 18:06 pt1979@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Change frack mgmt vlan interface - pt1979@cumin2002"
* 18:02 pt1979@cumin2002: START - Cookbook sre.dns.netbox
* 17:31 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248536{{!}}Enable wgUseSiteJs on donatewiki (T419138)]] (duration: 09m 57s)
* 17:27 mszwarc@deploy2002: mszwarc, krinkle: Continuing with sync
* 17:26 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host contint2003.wikimedia.org with OS bookworm
* 17:26 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:23 mszwarc@deploy2002: mszwarc, krinkle: Backport for [[gerrit:1248536{{!}}Enable wgUseSiteJs on donatewiki (T419138)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:21 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1248536{{!}}Enable wgUseSiteJs on donatewiki (T419138)]]
* 17:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1178.eqiad.wmnet
* 17:16 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 17:12 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker1162.eqiad.wmnet
* 17:12 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker1162.eqiad.wmnet
* 17:10 cgoubert@cumin1003: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for wikikube-worker1162.eqiad.wmnet
* 17:10 cgoubert@cumin1003: START - Cookbook sre.hosts.remove-downtime for wikikube-worker1162.eqiad.wmnet
* 17:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 17:05 taavi@cumin1003: dbctl commit (dc=all): 'enable writes', diff saved to https://phabricator.wikimedia.org/P89812 and previous config saved to /var/cache/conftool/dbconfig/20260305-170556-taavi.json
* 16:03 oblivian@cumin1003: dbctl commit (dc=all): 'read only s6', diff saved to https://phabricator.wikimedia.org/P89810 and previous config saved to /var/cache/conftool/dbconfig/20260305-160348-oblivian.json
* 15:32 taavi@cumin1003: dbctl commit (dc=all): 'set global ro', diff saved to https://phabricator.wikimedia.org/P89808 and previous config saved to /var/cache/conftool/dbconfig/20260305-153203-taavi.json
* 15:31 mszwarc@deploy2002: mszwarc: Continuing with sync
* 15:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1178.eqiad.wmnet
* 15:31 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1248509{{!}}Disable custom JS for a moment]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:29 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1248509{{!}}Disable custom JS for a moment]]
* 15:25 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts ['contint2003']
* 15:25 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['contint2003']
* 15:23 ebernhardson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248506{{!}}cirrus: Correct semantic builder config (T413969)]] (duration: 07m 39s)
* 15:22 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:19 ebernhardson@deploy2002: ebernhardson: Continuing with sync
* 15:18 ebernhardson@deploy2002: ebernhardson: Backport for [[gerrit:1248506{{!}}cirrus: Correct semantic builder config (T413969)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:16 ebernhardson@deploy2002: Started scap sync-world: Backport for [[gerrit:1248506{{!}}cirrus: Correct semantic builder config (T413969)]]
* 15:11 elukey@cumin1003: END (PASS) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=0) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 15:10 ebernhardson@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244713{{!}}cirrus: Add semantic search test cluster (T413969)]] (duration: 09m 18s)
* 15:06 ebernhardson@deploy2002: ebernhardson: Continuing with sync
* 15:04 sukhe@dns1004: END - running authdns-update
* 15:03 sukhe@dns1004: START - running authdns-update
* 15:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 15:02 ebernhardson@deploy2002: ebernhardson: Backport for [[gerrit:1244713{{!}}cirrus: Add semantic search test cluster (T413969)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:02 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host contint2003
* 15:02 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host contint2003
* 15:00 ebernhardson@deploy2002: Started scap sync-world: Backport for [[gerrit:1244713{{!}}cirrus: Add semantic search test cluster (T413969)]]
* 14:57 btullis@cumin1003: START - Cookbook sre.hosts.reimage for host dse-k8s-worker1010.eqiad.wmnet with OS bookworm
* 14:53 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 14:50 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main' .
* 14:38 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1010
* 14:38 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1010
* 14:32 btullis@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host dse-k8s-worker1010
* 14:32 btullis@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host dse-k8s-worker1010
* 14:32 sukhe@dns1004: END - running authdns-update
* 14:30 sukhe@dns1004: START - running authdns-update
* 14:28 elukey@cumin1003: END (FAIL) - Cookbook sre.kafka.change-confluent-distro-version (exit_code=99) Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 14:28 sukhe@dns1004: START - running authdns-update
* 14:27 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1231.eqiad.wmnet
* 14:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1230.eqiad.wmnet
* 14:24 bking@dns1004: START - running authdns-update
* 14:15 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1230.eqiad.wmnet
* 14:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1229.eqiad.wmnet
* 14:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 14:09 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'ores-legacy' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 14:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 14:05 moritzm: imported nodejs 24.14.0-1nodesource1 to thirdparty/node24 [[phab:T418440|T418440]]
* 14:03 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1229.eqiad.wmnet
* 14:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1228.eqiad.wmnet
* 14:01 moritzm: initialised ganeti02/ulsfo cluster [[phab:T418993|T418993]]
* 13:57 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:56 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:56 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:55 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:53 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:52 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1228.eqiad.wmnet
* 13:52 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1227.eqiad.wmnet
* 13:52 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:51 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:50 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:47 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:46 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:43 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:42 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:42 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1199.eqiad.wmnet
* 13:40 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1227.eqiad.wmnet
* 13:40 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1226.eqiad.wmnet
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 13:38 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] 'sync' command on namespace 'ores-legacy' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'llm' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:37 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:35 moritzm: installing glib2.0 security updates
* 13:35 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1199.eqiad.wmnet
* 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 13:33 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:26 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1226.eqiad.wmnet
* 13:26 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1225.eqiad.wmnet
* 13:26 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 13:15 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1225.eqiad.wmnet
* 13:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1224.eqiad.wmnet
* 13:09 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4006.ulsfo.wmnet with OS bookworm
* 13:08 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:07 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:07 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new VIP for routed ganeti in ulsfo - jmm@cumin2002"
* 13:06 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: add new VIP for routed ganeti in ulsfo - jmm@cumin2002"
* 13:06 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 13:05 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 13:02 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 13:01 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1224.eqiad.wmnet
* 13:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1223.eqiad.wmnet
* 13:00 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 12:59 dpogorzelski@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 12:58 cgoubert@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 14 days, 0:00:00 on wikikube-worker1162.eqiad.wmnet with reason: dcops intervention
* 12:57 cgoubert@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host wikikube-worker1162.eqiad.wmnet
* 12:56 cgoubert@cumin1003: START - Cookbook sre.k8s.pool-depool-node depool for host wikikube-worker1162.eqiad.wmnet
* 12:55 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main'.
* 12:50 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
* 12:49 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1223.eqiad.wmnet
* 12:49 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1222.eqiad.wmnet
* 12:46 elukey@cumin1003: START - Cookbook sre.kafka.change-confluent-distro-version Change Confluent distribution for Kafka A:kafka-test-eqiad cluster: Change Confluent distribution.
* 12:43 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4006.ulsfo.wmnet with reason: host reimage
* 12:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1222.eqiad.wmnet
* 12:37 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1221.eqiad.wmnet
* 12:23 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1221.eqiad.wmnet
* 12:23 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1220.eqiad.wmnet
* 12:23 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4006.ulsfo.wmnet with OS bookworm
* 12:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1220.eqiad.wmnet
* 11:53 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti4005.ulsfo.wmnet
* 11:44 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti4005.ulsfo.wmnet
* 11:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1236.eqiad.wmnet
* 11:29 moritzm: remove ganeti4006 from ganeti/ulsfo cluster [[phab:T418993|T418993]]
* 11:25 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1236.eqiad.wmnet
* 11:25 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1235.eqiad.wmnet
* 11:16 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1235.eqiad.wmnet
* 11:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1234.eqiad.wmnet
* 11:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1234.eqiad.wmnet
* 11:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1233.eqiad.wmnet
* 11:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1233.eqiad.wmnet
* 11:02 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1232.eqiad.wmnet
* 11:02 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main'.
* 11:00 elukey@cumin1003: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
* 10:59 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ganeti4005.ulsfo.wmnet with OS bookworm
* 10:55 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 10:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1232.eqiad.wmnet
* 10:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1231.eqiad.wmnet
* 10:54 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-analytics-test: apply
* 10:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1231.eqiad.wmnet
* 10:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1230.eqiad.wmnet
* 10:41 elukey@cumin1003: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
* 10:39 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ganeti4005.ulsfo.wmnet with reason: host reimage
* 10:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1230.eqiad.wmnet
* 10:37 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1229.eqiad.wmnet
* 10:32 jmm@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ganeti4005.ulsfo.wmnet with reason: host reimage
* 10:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1229.eqiad.wmnet
* 10:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1228.eqiad.wmnet
* 10:24 moritzm: installing Java 8 security updates
* 10:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1228.eqiad.wmnet
* 10:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1227.eqiad.wmnet
* 10:15 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1227.eqiad.wmnet
* 10:14 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1226.eqiad.wmnet
* 10:11 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
* 10:11 jmm@cumin2002: START - Cookbook sre.hosts.reimage for host ganeti4005.ulsfo.wmnet with OS bookworm
* 10:10 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts ganeti4005.ulsfo.wmnet
* 10:08 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ganeti4005.ulsfo.wmnet
* 10:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 10:08 ayounsi@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add gw-virtual.ulsfo.wmnet - ayounsi@cumin1003"
* 10:07 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1226.eqiad.wmnet
* 10:07 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1225.eqiad.wmnet
* 09:59 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1225.eqiad.wmnet
* 09:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1224.eqiad.wmnet
* 09:52 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1224.eqiad.wmnet
* 09:51 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1223.eqiad.wmnet
* 09:44 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1223.eqiad.wmnet
* 09:44 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1222.eqiad.wmnet
* 09:43 ayounsi@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: Add gw-virtual.ulsfo.wmnet - ayounsi@cumin1003"
* 09:36 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1222.eqiad.wmnet
* 09:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1221.eqiad.wmnet
* 09:32 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:32 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:28 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1221.eqiad.wmnet
* 09:27 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1220.eqiad.wmnet
* 09:20 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1220.eqiad.wmnet
* 09:02 ayounsi@cumin1003: START - Cookbook sre.dns.netbox
* 08:40 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
* 08:38 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247990{{!}}Drop 'centralnoticeadmin' from $wgOATHRequiredForGroups (T418580)]] (duration: 07m 07s)
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 08:35 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/dse-k8s-services/airflow-main: apply
* 08:34 mszwarc@deploy2002: mszwarc: Continuing with sync
* 08:33 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1247990{{!}}Drop 'centralnoticeadmin' from $wgOATHRequiredForGroups (T418580)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:30 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1247990{{!}}Drop 'centralnoticeadmin' from $wgOATHRequiredForGroups (T418580)]]
* 08:29 gehel@dns1004: END - running authdns-update
* 08:28 gehel@dns1004: START - running authdns-update
* 08:27 moritzm: installing mbedtls security updates
* 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/airflow-main: apply
* 08:23 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/airflow-main: apply
* 08:15 hashar@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248314{{!}}Revert "zhwiki: Add 2026 CNY celebration logos"]] (duration: 09m 19s)
* 08:11 hashar@deploy2002: hashar, stang: Continuing with sync
* 08:08 hashar@deploy2002: hashar, stang: Backport for [[gerrit:1248314{{!}}Revert "zhwiki: Add 2026 CNY celebration logos"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:06 hashar@deploy2002: Started scap sync-world: Backport for [[gerrit:1248314{{!}}Revert "zhwiki: Add 2026 CNY celebration logos"]]
* 08:02 moritzm: uploaded openjdk-8 8u482-ga-1~deb11u1 to component/jdk8 of bullseye-wikimedia
* 07:56 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.decommission (exit_code=0) for hosts bast4005.wikimedia.org
* 07:56 jmm@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 07:56 jmm@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast4005.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:55 jmm@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: bast4005.wikimedia.org decommissioned, removing all IPs except the asset tag one - jmm@cumin2002"
* 07:48 moritzm: uploaded bird2 2.18-1~wmf13u2 to the main component of trixie-wikimedia [[phab:T413740|T413740]]
* 07:47 jmm@cumin2002: START - Cookbook sre.dns.netbox
* 07:47 kevinbazira@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'llm' for release 'main'.
* 07:42 jmm@cumin2002: START - Cookbook sre.hosts.decommission for hosts bast4005.wikimedia.org
* 06:35 marostegui@cumin1003: dbctl commit (dc=all): 'Remove es1033 [[phab:T408772|T408772]]', diff saved to https://phabricator.wikimedia.org/P89804 and previous config saved to /var/cache/conftool/dbconfig/20260305-063548-marostegui.json
* 02:10 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 07m 55s)
* 02:02 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 02:01 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248163{{!}}Stop writing to il_to on medium size wikis (T415787)]] (duration: 06m 14s)
* 01:58 zabe@deploy2002: zabe: Continuing with sync
* 01:57 zabe@deploy2002: zabe: Backport for [[gerrit:1248163{{!}}Stop writing to il_to on medium size wikis (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:55 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248163{{!}}Stop writing to il_to on medium size wikis (T415787)]]
* 01:40 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1246099{{!}}Start reading from new file tables on medium wikis (T416548)]] (duration: 06m 15s)
* 01:36 zabe@deploy2002: zabe: Continuing with sync
* 01:36 zabe@deploy2002: zabe: Backport for [[gerrit:1246099{{!}}Start reading from new file tables on medium wikis (T416548)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:34 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1246099{{!}}Start reading from new file tables on medium wikis (T416548)]]
* 01:29 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248154{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]], [[gerrit:1248153{{!}}Revert^2 "ImageListPager: Properly support file schema migration read new"]] (duration: 07m 21s)
* 01:25 zabe@deploy2002: zabe: Continuing with sync
* 01:23 zabe@deploy2002: zabe: Backport for [[gerrit:1248154{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]], [[gerrit:1248153{{!}}Revert^2 "ImageListPager: Properly support file schema migration read new"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 01:21 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248154{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]], [[gerrit:1248153{{!}}Revert^2 "ImageListPager: Properly support file schema migration read new"]]
* 00:55 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248021{{!}}Stop writing to il_to on small wikis (T415787)]] (duration: 06m 49s)
* 00:51 zabe@deploy2002: zabe: Continuing with sync
* 00:50 zabe@deploy2002: zabe: Backport for [[gerrit:1248021{{!}}Stop writing to il_to on small wikis (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:48 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248021{{!}}Stop writing to il_to on small wikis (T415787)]]
* 00:19 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248125{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]], [[gerrit:1248123{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]] (duration: 08m 52s)
* 00:13 zabe@deploy2002: zabe: Continuing with sync
* 00:12 zabe@deploy2002: zabe: Backport for [[gerrit:1248125{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]], [[gerrit:1248123{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:10 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1248125{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]], [[gerrit:1248123{{!}}NewFilesPager: Properly support file schema migration read new (T419062)]]
== 2026-03-04 ==
* 22:57 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 22:56 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 22:55 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 22:55 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 22:55 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 22:54 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 22:35 tgr_: UTC late deploys done
* 22:33 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248084{{!}}Introduce a Semantic Search query route and builder (T413969)]], [[gerrit:1248085{{!}}Wire up semantic query building (T413969)]] (duration: 38m 28s)
* 22:16 tgr@deploy2002: tgr, ebernhardson: Continuing with sync
* 22:14 tgr@deploy2002: tgr, ebernhardson: Backport for [[gerrit:1248084{{!}}Introduce a Semantic Search query route and builder (T413969)]], [[gerrit:1248085{{!}}Wire up semantic query building (T413969)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248084{{!}}Introduce a Semantic Search query route and builder (T413969)]], [[gerrit:1248085{{!}}Wire up semantic query building (T413969)]]
* 21:48 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248012{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #3) (T415007 T418999)]] (duration: 07m 05s)
* 21:47 bking@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5 days, 0:00:00 on dse-k8s-worker1028.eqiad.wmnet with reason: broken networking
* 21:44 tgr@deploy2002: tgr: Continuing with sync
* 21:43 tgr@deploy2002: tgr: Backport for [[gerrit:1248012{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #3) (T415007 T418999)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:40 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248012{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #3) (T415007 T418999)]]
* 21:36 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248081{{!}}Add synthetic AAA experiment (T418614)]], [[gerrit:1248080{{!}}Add synthetic AAA experiment (T418614)]] (duration: 09m 11s)
* 21:35 bking@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1028.eqiad.wmnet
* 21:32 tgr@deploy2002: cjming, tgr: Continuing with sync
* 21:30 bking@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1028.eqiad.wmnet
* 21:29 tgr@deploy2002: cjming, tgr: Backport for [[gerrit:1248081{{!}}Add synthetic AAA experiment (T418614)]], [[gerrit:1248080{{!}}Add synthetic AAA experiment (T418614)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:27 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248081{{!}}Add synthetic AAA experiment (T418614)]], [[gerrit:1248080{{!}}Add synthetic AAA experiment (T418614)]]
* 21:21 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1245473{{!}}logging: set poolcounter channel log level to info (T418612)]] (duration: 09m 04s)
* 21:17 tgr@deploy2002: tgr, cwhite: Continuing with sync
* 21:14 tgr@deploy2002: tgr, cwhite: Backport for [[gerrit:1245473{{!}}logging: set poolcounter channel log level to info (T418612)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:12 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1245473{{!}}logging: set poolcounter channel log level to info (T418612)]]
* 21:07 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248007{{!}}Fix $wgJwtSessionCookieIssuer (T415007 T418999)]] (duration: 09m 55s)
* 21:03 tgr@deploy2002: tgr: Continuing with sync
* 20:59 tgr@deploy2002: tgr: Backport for [[gerrit:1248007{{!}}Fix $wgJwtSessionCookieIssuer (T415007 T418999)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 20:57 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248007{{!}}Fix $wgJwtSessionCookieIssuer (T415007 T418999)]]
* 19:56 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group1 to 1.46.0-wmf.18 refs [[phab:T413809|T413809]]
* 19:44 jhuneidi@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248011{{!}}CategoryViewer: Fall back to empty string in case of missing nextpage (T418934)]] (duration: 10m 47s)
* 19:44 cdobbins@cumin2002: conftool action : set/pooled=yes:weight=1; selector: name=cp205[0-8].codfw.wmnet
* 19:43 cdobbins@cumin2002: conftool action : set/pooled=yes:weight=1; selector: name=cp2049.codfw.wmnet
* 19:40 jhuneidi@deploy2002: zabe, jhuneidi: Continuing with sync
* 19:35 jhuneidi@deploy2002: zabe, jhuneidi: Backport for [[gerrit:1248011{{!}}CategoryViewer: Fall back to empty string in case of missing nextpage (T418934)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 19:34 brett@puppetserver1001: conftool action : set/weight=1; selector: name=cp2043.*
* 19:34 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp2043.*
* 19:33 jhuneidi@deploy2002: Started scap sync-world: Backport for [[gerrit:1248011{{!}}CategoryViewer: Fall back to empty string in case of missing nextpage (T418934)]]
* 19:30 brett@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2043.codfw.wmnet with OS trixie
* 19:23 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 19:22 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 19:22 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 19:22 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 19:09 brett@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2043.codfw.wmnet with reason: host reimage
* 19:06 brett@puppetserver1001: conftool action : set/weight=1; selector: name=cp204[45678].*
* 19:04 brett@puppetserver1001: conftool action : set/weight=100; selector: name=cp204[45678].*
* 19:02 brett@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2043.codfw.wmnet with reason: host reimage
* 18:58 brett@puppetserver1001: conftool action : set/pooled=yes; selector: name=cp204[45678].*
* 18:52 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-video: apply
* 18:51 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-video: apply
* 18:50 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-timeline: apply
* 18:50 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-timeline: apply
* 18:49 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:49 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:49 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-media: apply
* 18:48 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-media: apply
* 18:48 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox-constraints: apply
* 18:47 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox-constraints: apply
* 18:47 swfrench@deploy2002: helmfile [codfw] DONE helmfile.d/services/shellbox: apply
* 18:47 brett@cumin2002: START - Cookbook sre.hosts.reimage for host cp2043.codfw.wmnet with OS trixie
* 18:46 swfrench@deploy2002: helmfile [codfw] START helmfile.d/services/shellbox: apply
* 18:43 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-video: apply
* 18:42 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-video: apply
* 18:41 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-timeline: apply
* 18:41 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-timeline: apply
* 18:40 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:40 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:40 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-media: apply
* 18:40 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-media: apply
* 18:39 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-constraints: apply
* 18:39 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-constraints: apply
* 18:38 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox: apply
* 18:37 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox: apply
* 18:37 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:37 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:32 swfrench@deploy2002: helmfile [eqiad] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:32 swfrench@deploy2002: helmfile [eqiad] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:16 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-video: apply
* 18:16 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-video: apply
* 18:16 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-timeline: apply
* 18:15 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-timeline: apply
* 18:15 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:14 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-syntaxhighlight: apply
* 18:14 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-media: apply
* 18:13 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-media: apply
* 18:13 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox-constraints: apply
* 18:13 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox-constraints: apply
* 18:12 swfrench@deploy2002: helmfile [staging] DONE helmfile.d/services/shellbox: apply
* 18:12 swfrench@deploy2002: helmfile [staging] START helmfile.d/services/shellbox: apply
* 17:54 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2047.codfw.wmnet with OS trixie
* 17:33 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
* 17:27 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
* 17:23 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 17:23 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:18 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:18 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:15 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:13 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:09 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp2047.codfw.wmnet with OS trixie
* 16:55 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:55 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:54 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 16:54 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 16:39 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host aux-k8s-worker1007.eqiad.wmnet with OS bookworm
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-unlock-scap (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:39 root@deploy2002: Unlocked for deployment [ALL REPOSITORIES]: Datacenter switchover from eqiad to codfw - [[phab:T418133|T418133]] (duration: 25m 37s)
* 16:39 root@deploy2002: Forcefully removing global lock: Datacenter switchover from eqiad to codfw - [[phab:T418133|T418133]]
* 16:39 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-unlock-scap for datacenter switchover from eqiad to codfw
* 16:39 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:27 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
* 16:27 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
* 16:27 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-run-puppet-on-db-masters for datacenter switchover from eqiad to codfw
* 16:27 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.09-restore-ttl (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:26 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.09-restore-ttl for datacenter switchover from eqiad to codfw
* 16:26 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.08-start-maintenance (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:26 root@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-cron: apply
* 16:26 root@deploy2002: helmfile [codfw] START helmfile.d/services/mw-cron: apply
* 16:26 root@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-cron: apply
* 16:26 root@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-cron: apply
* 16:26 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.08-start-maintenance for datacenter switchover from eqiad to codfw
* 16:25 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:25 root@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-jobrunner: sync
* 16:25 root@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-jobrunner: sync
* 16:25 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.08-restart-mw-jobrunner for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: [DRY-RUN] MediaWiki read-only period ends at: 2026-03-04 16:24:40.502004
* 16:24 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.07-set-readwrite for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:24 ayounsi@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on aux-k8s-worker1007.eqiad.wmnet with reason: host reimage
* 16:24 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.06-set-db-readwrite for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:24 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.04-switch-mediawiki for datacenter switchover from eqiad to codfw
* 16:23 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.03-set-db-readonly (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:23 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.03-set-db-readonly for datacenter switchover from eqiad to codfw
* 16:23 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.02-set-readonly (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:22 blake@cumin1003: [DRY-RUN] MediaWiki read-only period starts at: 2026-03-04 16:22:41.755892
* 16:22 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.02-set-readonly for datacenter switchover from eqiad to codfw
* 16:20 ayounsi@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on aux-k8s-worker1007.eqiad.wmnet with reason: host reimage
* 16:20 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.01-stop-maintenance (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:20 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.01-stop-maintenance for datacenter switchover from eqiad to codfw
* 16:19 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-reduce-ttl (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:14 moritzm: upgrading cloudservices* to Bird 2.18 [[phab:T413740|T413740]]
* 16:14 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-reduce-ttl for datacenter switchover from eqiad to codfw
* 16:13 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-lock-scap (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:13 root@deploy2002: Locking from deployment [ALL REPOSITORIES]: Datacenter switchover from eqiad to codfw - [[phab:T418133|T418133]]
* 16:13 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-lock-scap for datacenter switchover from eqiad to codfw
* 16:13 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4006.ulsfo.wmnet
* 16:10 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4006.ulsfo.wmnet
* 16:10 moritzm: remove ganeti4005 from ganeti/ulsfo cluster [[phab:T418993|T418993]]
* 16:10 ayounsi@cumin1003: START - Cookbook sre.hosts.reimage for host aux-k8s-worker1007.eqiad.wmnet with OS bookworm
* 16:06 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks (exit_code=0) for datacenter switchover from eqiad to codfw
* 16:06 blake@cumin1003: START - Cookbook sre.switchdc.mediawiki.00-downtime-db-readonly-checks for datacenter switchover from eqiad to codfw
* 15:59 XioNoX: push pfw policies - [[phab:T418402|T418402]]
* 15:37 sukhe@dns1004: END - running authdns-update
* 15:36 sukhe@dns1004: START - running authdns-update
* 15:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1219.eqiad.wmnet
* 15:32 aqu@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/blunderbuss: apply
* 15:31 aqu@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/blunderbuss: apply
* 15:29 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4005.ulsfo.wmnet
* 15:29 cgoubert@cumin1003: END (PASS) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=0) rolling restart_daemons on P<nowiki>{</nowiki>ms-fe10[14-24].*<nowiki>}</nowiki> and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 15:24 cgoubert@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on P<nowiki>{</nowiki>ms-fe10[14-24].*<nowiki>}</nowiki> and (A:swift-fe or A:swift-fe-canary or A:swift-fe-codfw or A:swift-fe-eqiad)
* 15:22 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 15:22 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 15:22 cgoubert@cumin1003: END (ERROR) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=97) rolling restart_daemons on A:swift-fe-eqiad
* 15:21 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1219.eqiad.wmnet
* 15:21 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1218.eqiad.wmnet
* 15:19 cgoubert@cumin1003: START - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies rolling restart_daemons on A:swift-fe-eqiad
* 15:16 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: name=cirrussearch1120.eqiad.wmnet
* 15:16 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: name=cirrussearch1121.eqiad.wmnet
* 15:16 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp1115.eqiad.wmnet [reason: [[phab:T418772|T418772]] - BGP maintenance]
* 15:16 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:16 ayounsi@cumin1003: conftool action : set/pooled=yes; selector: name=cirrussearch1122.eqiad.wmnet
* 15:15 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:15 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:14 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:13 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:13 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 15:10 XioNoX: lsw1-d7-eqiad# tools network-instance default protocols bgp neighbor 10.64.128.17 reset-peer - [[phab:T418772|T418772]]
* 15:10 jmm@cumin2002: END (PASS) - Cookbook sre.kafka.roll-restart-reboot-brokers (exit_code=0) rolling restart_daemons on A:kafka-test-eqiad
* 15:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1218.eqiad.wmnet
* 15:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1217.eqiad.wmnet
* 15:09 jforrester@deploy2002: helmfile [eqiad] DONE helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy2002: helmfile [eqiad] START helmfile.d/services/wikifunctions: apply
* 15:08 jforrester@deploy2002: helmfile [codfw] DONE helmfile.d/services/wikifunctions: apply
* 15:05 jforrester@deploy2002: helmfile [codfw] START helmfile.d/services/wikifunctions: apply
* 15:05 moritzm: upgrading cloudlb* to Bird 2.18 [[phab:T413740|T413740]]
* 15:05 jforrester@deploy2002: helmfile [staging] DONE helmfile.d/services/wikifunctions: apply
* 15:04 jforrester@deploy2002: helmfile [staging] START helmfile.d/services/wikifunctions: apply
* 14:58 Dreamy_Jazz: Afternoon UTC backport window done
* 14:58 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244373{{!}}zhwiki: Remove all rights from accountcreator (T418089)]] (duration: 08m 12s)
* 14:57 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:57 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1217.eqiad.wmnet
* 14:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1216.eqiad.wmnet
* 14:57 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 14:56 btullis@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 6 days, 0:00:00 on dse-k8s-worker[1010-1011,1013,1018-1019].eqiad.wmnet with reason: Adding 10 Gbps NIC
* 14:54 dreamyjazz@deploy2002: dreamyjazz, 1f616emo: Continuing with sync
* 14:52 jmm@cumin2002: START - Cookbook sre.kafka.roll-restart-reboot-brokers rolling restart_daemons on A:kafka-test-eqiad
* 14:52 dreamyjazz@deploy2002: dreamyjazz, 1f616emo: Backport for [[gerrit:1244373{{!}}zhwiki: Remove all rights from accountcreator (T418089)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:50 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1244373{{!}}zhwiki: Remove all rights from accountcreator (T418089)]]
* 14:45 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1216.eqiad.wmnet
* 14:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1215.eqiad.wmnet
* 14:44 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248009{{!}}Hooks: Fix liquidthreads log type definition bugs (T417425 T419006)]], [[gerrit:1248008{{!}}Define $wgWikimediaMessagesHasLiquidThreadsLogs (T417425)]] (duration: 07m 11s)
* 14:44 cdobbins@cumin2002: conftool action : set/pooled=no; selector: name=cp1115.eqiad.wmnet [reason: [[phab:T418772|T418772]] - BGP maintenance]
* 14:44 taavi: updating CR firewall policy with https://gerrit.wikimedia.org/r/c/operations/homer/public/+/970275
* 14:42 ayounsi@cumin1003: conftool action : set/pooled=no; selector: name=cirrussearch1122.eqiad.wmnet
* 14:42 ayounsi@cumin1003: conftool action : set/pooled=no; selector: name=cirrussearch1121.eqiad.wmnet
* 14:42 ayounsi@cumin1003: conftool action : set/pooled=no; selector: name=cirrussearch1120.eqiad.wmnet
* 14:40 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 14:39 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1248009{{!}}Hooks: Fix liquidthreads log type definition bugs (T417425 T419006)]], [[gerrit:1248008{{!}}Define $wgWikimediaMessagesHasLiquidThreadsLogs (T417425)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:37 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1248009{{!}}Hooks: Fix liquidthreads log type definition bugs (T417425 T419006)]], [[gerrit:1248008{{!}}Define $wgWikimediaMessagesHasLiquidThreadsLogs (T417425)]]
* 14:33 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1215.eqiad.wmnet
* 14:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1214.eqiad.wmnet
* 14:32 btullis@puppetserver1001: conftool action : get/pooled=no; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1028.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1028.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1025.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/pooled=yes; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1024.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/weight=1; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1028.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/weight=1; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1025.eqiad.wmnet
* 14:31 btullis@puppetserver1001: conftool action : set/weight=1; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1024.eqiad.wmnet
* 14:30 btullis@puppetserver1001: conftool action : get/pooled; selector: service=kubesvc,cluster=dse-k8s,dc=eqiad,name=dse-k8s-worker1024.eqiad.wmnet
* 14:29 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_esams and A:cp - 3.0 upgrade ()
* 14:27 arnaudb@dns1004: END - running authdns-update
* 14:26 arnaudb@dns1004: START - running authdns-update
* 14:26 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1248000{{!}}Revert "Enable JWT session cookie for bot passwords (all wikis) (attempt #2)" (T415007 T418999)]] (duration: 07m 19s)
* 14:22 tgr@deploy2002: tgr: Continuing with sync
* 14:21 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1214.eqiad.wmnet
* 14:21 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1213.eqiad.wmnet
* 14:21 tgr@deploy2002: tgr: Backport for [[gerrit:1248000{{!}}Revert "Enable JWT session cookie for bot passwords (all wikis) (attempt #2)" (T415007 T418999)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:19 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1248000{{!}}Revert "Enable JWT session cookie for bot passwords (all wikis) (attempt #2)" (T415007 T418999)]]
* 14:14 sgimeno@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247566{{!}}Enable new HTML confirmation emails for all (T416748)]] (duration: 07m 46s)
* 14:13 elukey@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 14:13 elukey@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 14:10 sgimeno@deploy2002: migr, sgimeno: Continuing with sync
* 14:09 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1213.eqiad.wmnet
* 14:09 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1212.eqiad.wmnet
* 14:09 sgimeno@deploy2002: migr, sgimeno: Backport for [[gerrit:1247566{{!}}Enable new HTML confirmation emails for all (T416748)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:08 elukey@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 14:08 elukey@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 14:08 elukey@deploy2002: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:07 elukey@deploy2002: helmfile [ml-staging-codfw] START helmfile.d/admin 'sync'.
* 14:07 sgimeno@deploy2002: Started scap sync-world: Backport for [[gerrit:1247566{{!}}Enable new HTML confirmation emails for all (T416748)]]
* 13:59 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4005.ulsfo.wmnet
* 13:58 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti4005.ulsfo.wmnet
* 13:57 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti4005.ulsfo.wmnet
* 13:57 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1212.eqiad.wmnet
* 13:57 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1211.eqiad.wmnet
* 13:49 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_esams and A:cp - 3.0 upgrade ()
* 13:45 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1211.eqiad.wmnet
* 13:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1210.eqiad.wmnet
* 13:43 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_esams and A:cp - 3.0 upgrade ()
* 13:40 arnaudb@dns1004: END - running authdns-update
* 13:39 arnaudb@dns1004: START - running authdns-update
* 13:37 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
* 13:33 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1210.eqiad.wmnet
* 13:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1209.eqiad.wmnet
* 13:20 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1209.eqiad.wmnet
* 13:20 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1208.eqiad.wmnet
* 13:17 aokoth@deploy2002: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 13:17 aokoth@deploy2002: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 13:16 aokoth@deploy2002: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 13:15 aokoth@deploy2002: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 13:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1208.eqiad.wmnet
* 13:06 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1207.eqiad.wmnet
* 13:03 arnaudb@dns1005: END - running authdns-update
* 13:02 arnaudb@dns1005: START - running authdns-update
* 13:00 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_esams and A:cp - 3.0 upgrade ()
* 13:00 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
* 12:46 mvolz@deploy2002: helmfile [eqiad] DONE helmfile.d/services/citoid: apply
* 12:45 mvolz@deploy2002: helmfile [eqiad] START helmfile.d/services/citoid: apply
* 12:44 mvolz@deploy2002: helmfile [codfw] DONE helmfile.d/services/citoid: apply
* 12:44 mvolz@deploy2002: helmfile [codfw] START helmfile.d/services/citoid: apply
* 12:43 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 12:43 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
* 12:33 mvolz@deploy2002: helmfile [staging] DONE helmfile.d/services/citoid: apply
* 12:29 mvolz@deploy2002: helmfile [staging] START helmfile.d/services/citoid: apply
* 12:10 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
* 12:08 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp - 3.0 upgrade ()
* 12:03 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1207.eqiad.wmnet
* 12:03 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1206.eqiad.wmnet
* 11:53 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1206.eqiad.wmnet
* 11:53 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1205.eqiad.wmnet
* 11:36 jmm@cumin2002: END (PASS) - Cookbook sre.network.tls (exit_code=0) for network device lsw1-f8-eqiad
* 11:36 jmm@cumin2002: START - Cookbook sre.network.tls for network device lsw1-f8-eqiad
* 11:34 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqiad and A:cp - 3.0 upgrade ()
* 11:34 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqiad and A:cp - 3.0 upgrade ()
* 11:28 dreamyjazz@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247968{{!}}SI: Update instrumentation schema (T418293)]] (duration: 16m 22s)
* 11:22 fabfur: start upgrading haproxy to 3.0 on A:cp-eqiad ([[phab:T417253|T417253]])
* 11:22 dreamyjazz@deploy2002: dreamyjazz: Continuing with sync
* 11:17 dreamyjazz@deploy2002: dreamyjazz: Backport for [[gerrit:1247968{{!}}SI: Update instrumentation schema (T418293)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 11:13 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp - 3.0 upgrade ()
* 11:12 dreamyjazz@deploy2002: Started scap sync-world: Backport for [[gerrit:1247968{{!}}SI: Update instrumentation schema (T418293)]]
* 11:08 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp - 3.0 upgrade ()
* 11:07 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=1) rolling reimage on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 11:07 blake@cumin1003: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 11:06 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2356].codfw.wmnet
* 11:06 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2356].codfw.wmnet
* 11:03 blake@cumin1003: END (FAIL) - Cookbook sre.k8s.roll-reimage-nodes (exit_code=1) rolling reimage on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 11:03 blake@cumin1003: START - Cookbook sre.k8s.roll-reimage-nodes rolling reimage on P<nowiki>{</nowiki>wikikube-worker[2332-2356].codfw.wmnet<nowiki>}</nowiki> and (A:wikikube-master-codfw or A:wikikube-worker-codfw)
* 10:55 blake@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[2332-2356].codfw.wmnet
* 10:55 blake@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[2332-2356].codfw.wmnet
* 10:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1205.eqiad.wmnet
* 10:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1204.eqiad.wmnet
* 10:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 10:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 10:42 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1204.eqiad.wmnet
* 10:42 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1203.eqiad.wmnet
* 10:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 10:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 10:28 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1203.eqiad.wmnet
* 10:28 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1202.eqiad.wmnet
* 10:25 fabfur: start upgrading haproxy to 3.0 on A:cp-drmrs ([[phab:T417253|T417253]])
* 10:25 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_drmrs and A:cp - 3.0 upgrade ()
* 10:25 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_drmrs and A:cp - 3.0 upgrade ()
* 10:24 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247941{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]], [[gerrit:1247944{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]] (duration: 06m 42s)
* 10:22 arnaudb@dns1004: END - running authdns-update
* 10:20 arnaudb@dns1004: START - running authdns-update
* 10:20 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 10:20 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1247941{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]], [[gerrit:1247944{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 10:18 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1247941{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]], [[gerrit:1247944{{!}}WebPHandler: Allow the original being served on the web (T414805 T418745 T418346)]]
* 10:16 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1202.eqiad.wmnet
* 10:16 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1201.eqiad.wmnet
* 10:09 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 10:08 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 10:06 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 10:05 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 10:04 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1201.eqiad.wmnet
* 10:04 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1200.eqiad.wmnet
* 09:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 09:50 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1200.eqiad.wmnet
* 09:39 mszwarc@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247925{{!}}Require 2FA from CentralNotice admins and WMF Trust & Safety (T418580 T417880)]] (duration: 08m 23s)
* 09:36 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_codfw and A:cp - 3.0 upgrade ()
* 09:35 mszwarc@deploy2002: mszwarc: Continuing with sync
* 09:33 mszwarc@deploy2002: mszwarc: Backport for [[gerrit:1247925{{!}}Require 2FA from CentralNotice admins and WMF Trust & Safety (T418580 T417880)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hardware.upgrade-firmware (exit_code=0) upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:32 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host pki-root1002.eqiad.wmnet
* 09:31 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp - 3.0 upgrade ()
* 09:31 mszwarc@deploy2002: Started scap sync-world: Backport for [[gerrit:1247925{{!}}Require 2FA from CentralNotice admins and WMF Trust & Safety (T418580 T417880)]]
* 09:30 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 09:21 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host pki-root1002.eqiad.wmnet
* 09:20 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:20 jmm@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=99) upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:20 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:20 jmm@cumin2002: END (ERROR) - Cookbook sre.hardware.upgrade-firmware (exit_code=97) upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 09:03 gehel: switching off Blazegraph on wdqs2009 (legacy full graph endpoint is end of life) - [[phab:T411410|T411410]] / [[phab:T415073|T415073]]
* 09:03 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:02 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 09:02 mvernon@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1096.eqiad.wmnet
* 09:00 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 09:00 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet
* 08:59 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 08:58 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 08:57 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 08:56 mvernon@cumin1003: START - Cookbook sre.hosts.reboot-single for host ms-be1096.eqiad.wmnet
* 08:54 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet
* 08:52 jmm@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts pki-root1002.eqiad.wmnet
* 08:49 topranks: disabling IBGP session between ssw1-d1-eqiad and ssw1-d8-eqiad to remove backup paths try #2 [[phab:T411054|T411054]]
* 08:36 jynus@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on backup1007.eqiad.wmnet,dbprov1004.eqiad.wmnet with reason: network maintenance
* 08:32 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 08:31 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 08:21 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_codfw and A:cp - 3.0 upgrade ()
* 08:21 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_codfw and A:cp - 3.0 upgrade ()
* 08:11 fabfur@cumin1003: conftool action : set/pooled=yes; selector: name=cp5032.*
* 07:54 topranks: disabling IBGP session between ssw1-d1-eqiad and ssw1-d8-eqiad to remove backup paths [[phab:T411054|T411054]]
* 07:43 moritzm: installing libbpf updates from Bookworm point release
* 05:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2179.codfw.wmnet with reason: Maintenance
* 05:42 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1160.eqiad.wmnet with reason: Maintenance
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 04s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1007.eqiad.wmnet with reason: Maintenance
* 01:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89793 and previous config saved to /var/cache/conftool/dbconfig/20260304-015657-marostegui.json
* 01:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P89792 and previous config saved to /var/cache/conftool/dbconfig/20260304-014150-marostegui.json
* 01:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263', diff saved to https://phabricator.wikimedia.org/P89791 and previous config saved to /var/cache/conftool/dbconfig/20260304-012642-marostegui.json
* 01:23 zabe@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 01:22 zabe@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 01:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1263 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89790 and previous config saved to /var/cache/conftool/dbconfig/20260304-011134-marostegui.json
* 00:46 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1263 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89789 and previous config saved to /var/cache/conftool/dbconfig/20260304-004638-marostegui.json
* 00:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1263.eqiad.wmnet with reason: Maintenance
* 00:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89788 and previous config saved to /var/cache/conftool/dbconfig/20260304-004615-marostegui.json
* 00:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P89787 and previous config saved to /var/cache/conftool/dbconfig/20260304-003107-marostegui.json
* 00:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262', diff saved to https://phabricator.wikimedia.org/P89786 and previous config saved to /var/cache/conftool/dbconfig/20260304-001559-marostegui.json
* 00:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1262 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89785 and previous config saved to /var/cache/conftool/dbconfig/20260304-000052-marostegui.json
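
The 2026-03-04 entries above can be checked mechanically. The following is a minimal sketch, assuming the line layout seen in those entries (<code>* HH:MM user@host: STATE (RESULT) - Cookbook name ...</code>); that layout is inferred from the entries themselves and is not an official schema. The names <code>ENTRY</code> and <code>failed_cookbooks</code> are hypothetical. It lists cookbook runs that finished with a result other than PASS, and computes the length of the read-only window from the two timestamps logged at 16:22 and 16:24 (both of which are marked [DRY-RUN] in the log).

```python
import re
from datetime import datetime

# Hypothetical pattern inferred from the entries above; not an official log schema.
ENTRY = re.compile(
    r"^\* (?P<time>\d{2}:\d{2}) (?P<who>\S+): "
    r"(?P<state>START|END|DONE)(?: \((?P<result>[A-Z]+)\))? - "
    r"Cookbook (?P<cookbook>\S+)"
)


def failed_cookbooks(lines):
    """Return (time, who, cookbook, result) for finished runs that did not PASS."""
    failures = []
    for line in lines:
        m = ENTRY.match(line)
        # Skip non-cookbook entries, START entries (no result), and successful runs.
        if m is None or m["result"] in (None, "PASS"):
            continue
        failures.append((m["time"], m["who"], m["cookbook"], m["result"]))
    return failures


# Four entries copied verbatim from the 2026-03-04 log above.
sample = [
    "* 16:24 blake@cumin1003: END (PASS) - Cookbook sre.switchdc.mediawiki.07-set-readwrite (exit_code=0) for datacenter switchover from eqiad to codfw",
    "* 16:14 moritzm: upgrading cloudservices* to Bird 2.18 [[phab:T413740|T413740]]",
    "* 15:22 cgoubert@cumin1003: END (ERROR) - Cookbook sre.swift.roll-restart-reboot-swift-ms-proxies (exit_code=97) rolling restart_daemons on A:swift-fe-eqiad",
    "* 13:06 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1207.eqiad.wmnet",
]
failures = failed_cookbooks(sample)

# Timestamps copied from the [DRY-RUN] read-only entries at 16:22 and 16:24.
read_only_start = datetime.fromisoformat("2026-03-04 16:22:41.755892")
read_only_end = datetime.fromisoformat("2026-03-04 16:24:40.502004")
read_only_window = read_only_end - read_only_start

for failure in failures:
    print(failure)
print(read_only_window)
```

With the sample entries, only the 15:22 ERROR and the 13:06 FAIL runs are reported, and the read-only window comes out at just under two minutes. Matching on the result in parentheses rather than on <code>exit_code</code> keeps the pattern short; a stricter check could compare both.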
== 2026-03-03 ==
* 23:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1262 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89784 and previous config saved to /var/cache/conftool/dbconfig/20260303-233500-marostegui.json
* 23:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1262.eqiad.wmnet with reason: Maintenance
* 23:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89783 and previous config saved to /var/cache/conftool/dbconfig/20260303-233436-marostegui.json
* 23:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P89782 and previous config saved to /var/cache/conftool/dbconfig/20260303-231929-marostegui.json
* 23:10 rzl@deploy2002: helmfile [codfw] DONE helmfile.d/services/mw-experimental: apply
* 23:08 rzl@deploy2002: helmfile [codfw] START helmfile.d/services/mw-experimental: apply
* 23:08 rzl@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 23:07 rzl@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 23:05 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 23:05 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 23:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261', diff saved to https://phabricator.wikimedia.org/P89781 and previous config saved to /var/cache/conftool/dbconfig/20260303-230421-marostegui.json
* 23:04 bking@cumin2002: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) depool for host dse-k8s-worker1028.eqiad.wmnet
* 23:02 tgr@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247689{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247690{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247596{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #2) (T415007)]] (duration: 21m 47s)
* 23:00 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7008.magru.wmnet [reason: lldpd packet drop issues]
* 22:58 cdobbins@cumin2002: conftool action : set/pooled=yes; selector: name=cp7008 [reason: lldpd packet drop issues]
* 22:58 tgr@deploy2002: tgr: Continuing with sync
* 22:56 bking@cumin2002: START - Cookbook sre.k8s.pool-depool-node depool for host dse-k8s-worker1028.eqiad.wmnet
* 22:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1261 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89780 and previous config saved to /var/cache/conftool/dbconfig/20260303-224913-marostegui.json
* 22:45 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 22:45 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 22:44 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 22:44 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 22:42 tgr@deploy2002: tgr: Backport for [[gerrit:1247689{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247690{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247596{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #2) (T415007)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:40 tgr@deploy2002: Started scap sync-world: Backport for [[gerrit:1247689{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247690{{!}}Do not invalidate anon sessions with non-anon JWT cookies (T415007)]], [[gerrit:1247596{{!}}Enable JWT session cookie for bot passwords (all wikis) (attempt #2) (T415007)]]
* 22:26 bking@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-test: apply
* 22:26 bking@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-test: apply
* 22:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1261 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89779 and previous config saved to /var/cache/conftool/dbconfig/20260303-222324-marostegui.json
* 22:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1261.eqiad.wmnet with reason: Maintenance
* 22:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89778 and previous config saved to /var/cache/conftool/dbconfig/20260303-222301-marostegui.json
* 22:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P89777 and previous config saved to /var/cache/conftool/dbconfig/20260303-220754-marostegui.json
* 21:59 rzl@deploy2002: Finished scap sync-world: https://gerrit.wikimedia.org/r/1245162 [[phab:T411807|T411807]] (duration: 12m 15s)
* 21:58 rzl@deploy2002: rzl: Continuing with sync
* 21:56 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 21:56 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 21:55 rzl@deploy2002: rzl: https://gerrit.wikimedia.org/r/1245162 [[phab:T411807|T411807]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:54 rzl@deploy2002: Started scap sync-world: https://gerrit.wikimedia.org/r/1245162 [[phab:T411807|T411807]]
* 21:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260', diff saved to https://phabricator.wikimedia.org/P89776 and previous config saved to /var/cache/conftool/dbconfig/20260303-215247-marostegui.json
* 21:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89775 and previous config saved to /var/cache/conftool/dbconfig/20260303-214931-marostegui.json
* 21:48 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp2045.codfw.wmnet
* 21:48 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp2045.codfw.wmnet
* 21:40 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 21:39 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 21:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1260 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89774 and previous config saved to /var/cache/conftool/dbconfig/20260303-213739-marostegui.json
* 21:35 jhuneidi@deploy2002: Finished scap sync-world: Backport for [[gerrit:1244748{{!}}REST: show the beta Attribution API in the REST Sandbox (T418522)]] (duration: 07m 41s)
* 21:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P89773 and previous config saved to /var/cache/conftool/dbconfig/20260303-213423-marostegui.json
* 21:32 jhuneidi@deploy2002: jhuneidi, bpirkle: Continuing with sync
* 21:30 jhuneidi@deploy2002: jhuneidi, bpirkle: Backport for [[gerrit:1244748{{!}}REST: show the beta Attribution API in the REST Sandbox (T418522)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:28 jhuneidi@deploy2002: Started scap sync-world: Backport for [[gerrit:1244748{{!}}REST: show the beta Attribution API in the REST Sandbox (T418522)]]
* 21:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248', diff saved to https://phabricator.wikimedia.org/P89772 and previous config saved to /var/cache/conftool/dbconfig/20260303-211915-marostegui.json
* 21:18 jhuneidi@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247652{{!}}Remove redundant mw-extra wgRestSandboxSpecs entry]] (duration: 06m 56s)
* 21:14 jhuneidi@deploy2002: jhuneidi, aaron: Continuing with sync
* 21:13 jhuneidi@deploy2002: jhuneidi, aaron: Backport for [[gerrit:1247652{{!}}Remove redundant mw-extra wgRestSandboxSpecs entry]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:11 jhuneidi@deploy2002: Started scap sync-world: Backport for [[gerrit:1247652{{!}}Remove redundant mw-extra wgRestSandboxSpecs entry]]
* 21:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1260 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89771 and previous config saved to /var/cache/conftool/dbconfig/20260303-211033-marostegui.json
* 21:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1260.eqiad.wmnet with reason: Maintenance
* 21:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89770 and previous config saved to /var/cache/conftool/dbconfig/20260303-211009-marostegui.json
* 21:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89769 and previous config saved to /var/cache/conftool/dbconfig/20260303-210407-marostegui.json
* 20:58 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp2045.codfw.wmnet with reason: troubleshooting for [[phab:T418527|T418527]]
* 20:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P89768 and previous config saved to /var/cache/conftool/dbconfig/20260303-205502-marostegui.json
* 20:51 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp7008.magru.wmnet with OS trixie
* 20:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89767 and previous config saved to /var/cache/conftool/dbconfig/20260303-204452-marostegui.json
* 20:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2248.codfw.wmnet with reason: Maintenance
* 20:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89766 and previous config saved to /var/cache/conftool/dbconfig/20260303-204439-marostegui.json
* 20:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252', diff saved to https://phabricator.wikimedia.org/P89765 and previous config saved to /var/cache/conftool/dbconfig/20260303-203954-marostegui.json
* 20:34 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:34 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 20:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P89764 and previous config saved to /var/cache/conftool/dbconfig/20260303-202931-marostegui.json
* 20:24 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp7008.magru.wmnet with reason: host reimage
* 20:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1252 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89763 and previous config saved to /var/cache/conftool/dbconfig/20260303-202447-marostegui.json
* 20:17 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp7008.magru.wmnet with reason: host reimage
* 20:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247', diff saved to https://phabricator.wikimedia.org/P89762 and previous config saved to /var/cache/conftool/dbconfig/20260303-201423-marostegui.json
* 20:10 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1199.eqiad.wmnet
* 19:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89761 and previous config saved to /var/cache/conftool/dbconfig/20260303-195916-marostegui.json
* 19:59 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1252 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89760 and previous config saved to /var/cache/conftool/dbconfig/20260303-195900-marostegui.json
* 19:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1252.eqiad.wmnet with reason: Maintenance
* 19:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89759 and previous config saved to /var/cache/conftool/dbconfig/20260303-195835-marostegui.json
* 19:51 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp7008.magru.wmnet with OS trixie
* 19:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P89758 and previous config saved to /var/cache/conftool/dbconfig/20260303-194327-marostegui.json
* 19:42 brett@cumin2002: END (PASS) - Cookbook sre.hosts.remove-downtime (exit_code=0) for cp2043.codfw.wmnet
* 19:42 brett@cumin2002: START - Cookbook sre.hosts.remove-downtime for cp2043.codfw.wmnet
* 19:33 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89757 and previous config saved to /var/cache/conftool/dbconfig/20260303-193351-marostegui.json
* 19:33 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2247.codfw.wmnet with reason: Maintenance
* 19:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89756 and previous config saved to /var/cache/conftool/dbconfig/20260303-193338-marostegui.json
* 19:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249', diff saved to https://phabricator.wikimedia.org/P89755 and previous config saved to /var/cache/conftool/dbconfig/20260303-192820-marostegui.json
* 19:19 jhuneidi@deploy2002: rebuilt and synchronized wikiversions files: group0 to 1.46.0-wmf.18 refs [[phab:T413809|T413809]]
* 19:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P89754 and previous config saved to /var/cache/conftool/dbconfig/20260303-191830-marostegui.json
* 19:17 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host cp2047.codfw.wmnet with OS trixie
* 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1249 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89753 and previous config saved to /var/cache/conftool/dbconfig/20260303-191312-marostegui.json
* 19:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246', diff saved to https://phabricator.wikimedia.org/P89752 and previous config saved to /var/cache/conftool/dbconfig/20260303-190323-marostegui.json
* 18:53 cdobbins@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
* 18:49 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1199.eqiad.wmnet
* 18:49 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1198.eqiad.wmnet
* 18:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1249 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89751 and previous config saved to /var/cache/conftool/dbconfig/20260303-184937-marostegui.json
* 18:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1249.eqiad.wmnet with reason: Maintenance
* 18:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89750 and previous config saved to /var/cache/conftool/dbconfig/20260303-184913-marostegui.json
* 18:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2246 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89749 and previous config saved to /var/cache/conftool/dbconfig/20260303-184815-marostegui.json
* 18:47 cdobbins@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on cp2047.codfw.wmnet with reason: host reimage
* 18:45 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1096.eqiad.wmnet with OS bullseye
* 18:36 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1198.eqiad.wmnet
* 18:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1197.eqiad.wmnet
* 18:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P89747 and previous config saved to /var/cache/conftool/dbconfig/20260303-183406-marostegui.json
* 18:29 cdobbins@cumin2002: START - Cookbook sre.hosts.reimage for host cp2047.codfw.wmnet with OS trixie
* 18:24 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1197.eqiad.wmnet
* 18:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1196.eqiad.wmnet
* 18:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2246 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89746 and previous config saved to /var/cache/conftool/dbconfig/20260303-182346-marostegui.json
* 18:23 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1096.eqiad.wmnet with reason: host reimage
* 18:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2246.codfw.wmnet with reason: Maintenance
* 18:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89745 and previous config saved to /var/cache/conftool/dbconfig/20260303-182321-marostegui.json
* 18:19 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1096.eqiad.wmnet with reason: host reimage
* 18:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248', diff saved to https://phabricator.wikimedia.org/P89744 and previous config saved to /var/cache/conftool/dbconfig/20260303-181859-marostegui.json
* 18:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1196.eqiad.wmnet
* 18:11 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1195.eqiad.wmnet
* 18:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P89743 and previous config saved to /var/cache/conftool/dbconfig/20260303-180814-marostegui.json
* 18:04 jforrester@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247635{{!}}Style fixes for copy-paste feature (T414072)]] (duration: 32m 54s)
* 18:04 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 18:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89742 and previous config saved to /var/cache/conftool/dbconfig/20260303-180352-marostegui.json
* 18:02 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1096.eqiad.wmnet with OS bullseye
* 18:02 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 17:59 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1195.eqiad.wmnet
* 17:59 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host an-worker1194.eqiad.wmnet
* 17:55 ariel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 17:53 ariel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 17:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245', diff saved to https://phabricator.wikimedia.org/P89741 and previous config saved to /var/cache/conftool/dbconfig/20260303-175304-marostegui.json
* 17:52 jforrester@deploy2002: jforrester: Continuing with sync
* 17:51 jforrester@deploy2002: jforrester: Backport for [[gerrit:1247635{{!}}Style fixes for copy-paste feature (T414072)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 17:47 ariel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 17:46 ariel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 17:41 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1194.eqiad.wmnet
* 17:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1193.eqiad.wmnet
* 17:39 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1248 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89740 and previous config saved to /var/cache/conftool/dbconfig/20260303-173914-marostegui.json
* 17:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1248.eqiad.wmnet with reason: Maintenance
* 17:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89739 and previous config saved to /var/cache/conftool/dbconfig/20260303-173850-marostegui.json
* 17:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2245 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89738 and previous config saved to /var/cache/conftool/dbconfig/20260303-173756-marostegui.json
* 17:31 jforrester@deploy2002: Started scap sync-world: Backport for [[gerrit:1247635{{!}}Style fixes for copy-paste feature (T414072)]]
* 17:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1193.eqiad.wmnet
* 17:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1192.eqiad.wmnet
* 17:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P89736 and previous config saved to /var/cache/conftool/dbconfig/20260303-172343-marostegui.json
* 17:18 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1192.eqiad.wmnet
* 17:18 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1191.eqiad.wmnet
* 17:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2245 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89735 and previous config saved to /var/cache/conftool/dbconfig/20260303-171149-marostegui.json
* 17:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2245.codfw.wmnet with reason: Maintenance
* 17:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89734 and previous config saved to /var/cache/conftool/dbconfig/20260303-171126-marostegui.json
* 17:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247', diff saved to https://phabricator.wikimedia.org/P89733 and previous config saved to /var/cache/conftool/dbconfig/20260303-170835-marostegui.json
* 17:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1191.eqiad.wmnet
* 17:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1190.eqiad.wmnet
* 16:56 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1190.eqiad.wmnet
* 16:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P89732 and previous config saved to /var/cache/conftool/dbconfig/20260303-165618-marostegui.json
* 16:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89731 and previous config saved to /var/cache/conftool/dbconfig/20260303-165327-marostegui.json
* 16:41 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1189.eqiad.wmnet
* 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240', diff saved to https://phabricator.wikimedia.org/P89730 and previous config saved to /var/cache/conftool/dbconfig/20260303-164111-marostegui.json
* 16:34 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main'.
* 16:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1189.eqiad.wmnet
* 16:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1188.eqiad.wmnet
* 16:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1247 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89729 and previous config saved to /var/cache/conftool/dbconfig/20260303-162845-marostegui.json
* 16:28 fceratto@cumin1003: dbctl commit (dc=all): 'Setting x1 codfw weights to 300 [[phab:T416705|T416705]]', diff saved to https://phabricator.wikimedia.org/P89728 and previous config saved to /var/cache/conftool/dbconfig/20260303-162836-fceratto.json
* 16:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1247.eqiad.wmnet with reason: Maintenance
* 16:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2240 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89727 and previous config saved to /var/cache/conftool/dbconfig/20260303-162603-marostegui.json
* 16:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 16:18 fceratto@cumin1003: dbctl commit (dc=all): 'Setting db1188 weight to 100 [[phab:T416705|T416705]]', diff saved to https://phabricator.wikimedia.org/P89726 and previous config saved to /var/cache/conftool/dbconfig/20260303-161846-fceratto.json
* 16:18 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 16:17 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1188.eqiad.wmnet
* 16:17 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1187.eqiad.wmnet
* 16:14 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.depool (exit_code=0) depool db1166: testing:crash
* 16:14 marostegui@cumin1003: START - Cookbook sre.mysql.depool depool db1166: testing:crash
* 16:13 fceratto@cumin1003: dbctl commit (dc=all): 'Setting db1169 weight to 300 [[phab:T416705|T416705]]', diff saved to https://phabricator.wikimedia.org/P89724 and previous config saved to /var/cache/conftool/dbconfig/20260303-161323-fceratto.json
* 16:12 fceratto@cumin1003: dbctl commit (dc=all): 'Setting db1188 weight to 300 [[phab:T416705|T416705]]', diff saved to https://phabricator.wikimedia.org/P89723 and previous config saved to /var/cache/conftool/dbconfig/20260303-161230-fceratto.json
* 16:07 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 16:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89722 and previous config saved to /var/cache/conftool/dbconfig/20260303-160720-marostegui.json
* 16:07 brennen@deploy2002: Finished deploy [phabricator/deployment@a883b6d]: deploy phab1004 for [[phab:T418872|T418872]] (duration: 01m 07s)
* 16:06 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1187.eqiad.wmnet
* 16:06 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1186.eqiad.wmnet
* 16:05 brennen@deploy2002: Started deploy [phabricator/deployment@a883b6d]: deploy phab1004 for [[phab:T418872|T418872]]
* 16:05 brennen@deploy2002: Finished deploy [phabricator/deployment@a883b6d]: deploy phab2002 for [[phab:T418872|T418872]] (duration: 00m 32s)
* 16:04 brennen@deploy2002: Started deploy [phabricator/deployment@a883b6d]: deploy phab2002 for [[phab:T418872|T418872]]
* 16:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2240 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89721 and previous config saved to /var/cache/conftool/dbconfig/20260303-160207-marostegui.json
* 16:02 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab1004.eqiad.wmnet with reason: Phabricator deploy
* 16:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2240.codfw.wmnet with reason: Maintenance
* 16:01 jelto@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 0:30:00 on phab2002.codfw.wmnet with reason: Phabricator deploy
* 16:00 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247569{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]] (duration: 09m 28s)
* 15:54 zabe@deploy2002: zabe: Continuing with sync
* 15:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1186.eqiad.wmnet
* 15:54 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1185.eqiad.wmnet
* 15:54 zabe@deploy2002: zabe: Backport for [[gerrit:1247569{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 15:53 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade
* 15:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P89720 and previous config saved to /var/cache/conftool/dbconfig/20260303-155212-marostegui.json
* 15:50 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1247569{{!}}ImageListPager: Use correct name field for batch lookups (T418327)]]
* 15:49 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade
* 15:45 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:44 elukey@cumin1003: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 15:42 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1185.eqiad.wmnet
* 15:42 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1184.eqiad.wmnet
* 15:42 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:41 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 15:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2239.codfw.wmnet with reason: Maintenance
* 15:41 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 15:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89719 and previous config saved to /var/cache/conftool/dbconfig/20260303-154104-marostegui.json
* 15:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244', diff saved to https://phabricator.wikimedia.org/P89718 and previous config saved to /var/cache/conftool/dbconfig/20260303-153704-marostegui.json
* 15:36 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1352-1359].eqiad.wmnet
* 15:36 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1352-1359].eqiad.wmnet
* 15:30 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1184.eqiad.wmnet
* 15:30 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1183.eqiad.wmnet
* 15:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P89717 and previous config saved to /var/cache/conftool/dbconfig/20260303-152557-marostegui.json
* 15:23 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1178.eqiad.wmnet
* 15:22 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on P<nowiki>{</nowiki>cp5032.*<nowiki>}</nowiki> and A:cp - 3.0 upgrade ()
* 15:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1244 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89716 and previous config saved to /var/cache/conftool/dbconfig/20260303-152157-marostegui.json
* 15:19 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1183.eqiad.wmnet
* 15:19 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1182.eqiad.wmnet
* 15:16 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on P<nowiki>{</nowiki>cp5032.*<nowiki>}</nowiki> and A:cp - 3.0 upgrade ()
* 15:15 fabfur@cumin1003: END (FAIL) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=1) rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
* 15:14 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade ()
* 15:14 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
* 15:13 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade ()
* 15:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 15:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237', diff saved to https://phabricator.wikimedia.org/P89715 and previous config saved to /var/cache/conftool/dbconfig/20260303-151049-marostegui.json
* 15:08 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 15:07 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1182.eqiad.wmnet
* 15:07 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1181.eqiad.wmnet
* 14:57 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1244 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89714 and previous config saved to /var/cache/conftool/dbconfig/20260303-145727-marostegui.json
* 14:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1244.eqiad.wmnet with reason: Maintenance
* 14:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89713 and previous config saved to /var/cache/conftool/dbconfig/20260303-145704-marostegui.json
* 14:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2237 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89712 and previous config saved to /var/cache/conftool/dbconfig/20260303-145541-marostegui.json
* 14:55 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1181.eqiad.wmnet
* 14:55 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1180.eqiad.wmnet
* 14:49 moritzm: installing php7.4 security updates
* 14:46 jayme@cumin1003: END (FAIL) - Cookbook sre.k8s.pool-depool-node (exit_code=99) pool for host wikikube-worker[1352-1359].eqiad.wmnet
* 14:46 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1352-1359].eqiad.wmnet
* 14:43 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1180.eqiad.wmnet
* 14:43 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1179.eqiad.wmnet
* 14:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P89711 and previous config saved to /var/cache/conftool/dbconfig/20260303-144156-marostegui.json
* 14:38 gkyziridis@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 14:38 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1240716{{!}}Remove Editing-related config for special wikis (T400063)]] (duration: 06m 34s)
* 14:36 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:34 esanders@deploy2002: esanders: Continuing with sync
* 14:34 esanders@deploy2002: esanders: Backport for [[gerrit:1240716{{!}}Remove Editing-related config for special wikis (T400063)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:34 fceratto@deploy2002: helmfile [aux-k8s-eqiad] 'sync' command on namespace 'zarcillo' for release 'main' .
* 14:32 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1240716{{!}}Remove Editing-related config for special wikis (T400063)]]
* 14:31 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1179.eqiad.wmnet
* 14:31 btullis@cumin1003: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=99) for host an-worker1178.eqiad.wmnet
* 14:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2237 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89710 and previous config saved to /var/cache/conftool/dbconfig/20260303-143141-marostegui.json
* 14:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2237.codfw.wmnet with reason: Maintenance
* 14:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89709 and previous config saved to /var/cache/conftool/dbconfig/20260303-143117-marostegui.json
* 14:29 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247578{{!}}PasteCheck: Enable by default (T405127)]] (duration: 08m 01s)
* 14:27 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
* 14:27 mvernon@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ms-be1097.eqiad.wmnet
* 14:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243', diff saved to https://phabricator.wikimedia.org/P89708 and previous config saved to /var/cache/conftool/dbconfig/20260303-142649-marostegui.json
* 14:26 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
* 14:25 esanders@deploy2002: esanders: Continuing with sync
* 14:23 esanders@deploy2002: esanders: Backport for [[gerrit:1247578{{!}}PasteCheck: Enable by default (T405127)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:21 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1247578{{!}}PasteCheck: Enable by default (T405127)]]
* 14:20 mvernon@cumin2002: START - Cookbook sre.hosts.reboot-single for host ms-be1097.eqiad.wmnet
* 14:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P89707 and previous config saved to /var/cache/conftool/dbconfig/20260303-141610-marostegui.json
* 14:15 esanders@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247576{{!}}Enable Wikibase GraphQL on test.wikidata.org (T417619)]], [[gerrit:1247577{{!}}Enable Wikibase GraphQL on production wikidata.org (T417619)]] (duration: 08m 17s)
* 14:11 esanders@deploy2002: esanders, jakob: Continuing with sync
* 14:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1243 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89706 and previous config saved to /var/cache/conftool/dbconfig/20260303-141142-marostegui.json
* 14:09 esanders@deploy2002: esanders, jakob: Backport for [[gerrit:1247576{{!}}Enable Wikibase GraphQL on test.wikidata.org (T417619)]], [[gerrit:1247577{{!}}Enable Wikibase GraphQL on production wikidata.org (T417619)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:07 esanders@deploy2002: Started scap sync-world: Backport for [[gerrit:1247576{{!}}Enable Wikibase GraphQL on test.wikidata.org (T417619)]], [[gerrit:1247577{{!}}Enable Wikibase GraphQL on production wikidata.org (T417619)]]
* 14:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236', diff saved to https://phabricator.wikimedia.org/P89704 and previous config saved to /var/cache/conftool/dbconfig/20260303-140102-marostegui.json
* 13:47 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1243 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89703 and previous config saved to /var/cache/conftool/dbconfig/20260303-134702-marostegui.json
* 13:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1243.eqiad.wmnet with reason: Maintenance
* 13:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89702 and previous config saved to /var/cache/conftool/dbconfig/20260303-134639-marostegui.json
* 13:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89701 and previous config saved to /var/cache/conftool/dbconfig/20260303-134554-marostegui.json
* 13:31 moritzm: installing NSS security updates
* 13:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P89700 and previous config saved to /var/cache/conftool/dbconfig/20260303-133131-marostegui.json
* 13:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89699 and previous config saved to /var/cache/conftool/dbconfig/20260303-132414-marostegui.json
* 13:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2236.codfw.wmnet with reason: Maintenance
* 13:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89698 and previous config saved to /var/cache/conftool/dbconfig/20260303-132350-marostegui.json
* 13:20 tappof: Thanos: re-enable querier<->ruler cross-site traffic [[phab:T412924|T412924]]
* 13:17 dpogorzelski@cumin1003: conftool action : set/pooled=true; selector: dnsdisc=recommendation-api,name=eqiad
* 13:17 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in eqiad/ml-serve-eqiad: maintenance
* 13:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242', diff saved to https://phabricator.wikimedia.org/P89697 and previous config saved to /var/cache/conftool/dbconfig/20260303-131624-marostegui.json
* 13:16 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in eqiad/ml-serve-eqiad: maintenance
* 13:11 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1178.eqiad.wmnet
* 13:11 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1177.eqiad.wmnet
* 13:10 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1359.eqiad.wmnet with OS trixie
* 13:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P89696 and previous config saved to /var/cache/conftool/dbconfig/20260303-130842-marostegui.json
* 13:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1242 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89695 and previous config saved to /var/cache/conftool/dbconfig/20260303-130117-marostegui.json
* 13:01 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 13:00 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1177.eqiad.wmnet
* 13:00 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 13:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1176.eqiad.wmnet
* 12:59 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1358.eqiad.wmnet with OS trixie
* 12:56 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:55 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:53 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1359.eqiad.wmnet with reason: host reimage
* 12:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219', diff saved to https://phabricator.wikimedia.org/P89694 and previous config saved to /var/cache/conftool/dbconfig/20260303-125335-marostegui.json
* 12:52 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1357.eqiad.wmnet with OS trixie
* 12:51 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:50 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:48 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1359.eqiad.wmnet with reason: host reimage
* 12:48 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1356.eqiad.wmnet with OS trixie
* 12:47 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:47 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 12:47 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:47 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 12:46 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 12:46 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-drafttopic' for release 'main' .
* 12:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1176.eqiad.wmnet
* 12:46 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1175.eqiad.wmnet
* 12:46 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 12:45 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:45 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 12:45 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 12:45 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:45 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'recommendation-api-ng' for release 'main' .
* 12:44 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'readability' for release 'main' .
* 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] 'sync' command on namespace 'ores-legacy' for release 'main' .
* 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 12:43 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 12:43 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1358.eqiad.wmnet with reason: host reimage
* 12:42 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 12:42 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 12:41 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'sync'.
* 12:40 ladsgroup@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247559{{!}}Enable thumb steps on private wikis too (T414805)]] (duration: 13m 01s)
* 12:39 dpogorzelski@deploy2002: helmfile [ml-serve-eqiad] START helmfile.d/admin 'sync'.
* 12:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89693 and previous config saved to /var/cache/conftool/dbconfig/20260303-123827-marostegui.json
* 12:36 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1357.eqiad.wmnet with reason: host reimage
* 12:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1242 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89692 and previous config saved to /var/cache/conftool/dbconfig/20260303-123642-marostegui.json
* 12:36 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1359.eqiad.wmnet with OS trixie
* 12:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1242.eqiad.wmnet with reason: Maintenance
* 12:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89691 and previous config saved to /var/cache/conftool/dbconfig/20260303-123619-marostegui.json
* 12:34 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1175.eqiad.wmnet
* 12:34 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1174.eqiad.wmnet
* 12:34 dpogorzelski@cumin1003: conftool action : set/pooled=false; selector: dnsdisc=recommendation-api,name=eqiad
* 12:33 ladsgroup@deploy2002: ladsgroup: Continuing with sync
* 12:33 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1356.eqiad.wmnet with reason: host reimage
* 12:31 daniel@deploy2002: helmfile [codfw] DONE helmfile.d/services/rest-gateway: apply
* 12:31 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in eqiad/ml-serve-eqiad: maintenance
* 12:31 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:31 ladsgroup@deploy2002: ladsgroup: Backport for [[gerrit:1247559{{!}}Enable thumb steps on private wikis too (T414805)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 12:30 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in eqiad/ml-serve-eqiad: maintenance
* 12:28 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1358.eqiad.wmnet with reason: host reimage
* 12:28 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1357.eqiad.wmnet with reason: host reimage
* 12:28 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1356.eqiad.wmnet with reason: host reimage
* 12:27 daniel@deploy2002: helmfile [eqiad] DONE helmfile.d/services/rest-gateway: apply
* 12:27 ladsgroup@deploy2002: Started scap sync-world: Backport for [[gerrit:1247559{{!}}Enable thumb steps on private wikis too (T414805)]]
* 12:26 daniel@deploy2002: helmfile [eqiad] START helmfile.d/services/rest-gateway: apply
* 12:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1174.eqiad.wmnet
* 12:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1173.eqiad.wmnet
* 12:21 daniel@deploy2002: helmfile [staging] DONE helmfile.d/services/rest-gateway: apply
* 12:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P89690 and previous config saved to /var/cache/conftool/dbconfig/20260303-122112-marostegui.json
* 12:20 daniel@deploy2002: helmfile [staging] START helmfile.d/services/rest-gateway: apply
* 12:20 daniel@deploy2002: helmfile [codfw] START helmfile.d/services/rest-gateway: apply
* 12:19 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1353.eqiad.wmnet with OS trixie
* 12:16 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1358.eqiad.wmnet with OS trixie
* 12:16 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1357.eqiad.wmnet with OS trixie
* 12:15 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1356.eqiad.wmnet with OS trixie
* 12:14 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1355.eqiad.wmnet with OS trixie
* 12:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89689 and previous config saved to /var/cache/conftool/dbconfig/20260303-121420-marostegui.json
* 12:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2219.codfw.wmnet with reason: Maintenance
* 12:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89688 and previous config saved to /var/cache/conftool/dbconfig/20260303-121355-marostegui.json
* 12:09 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1354.eqiad.wmnet with OS trixie
* 12:08 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1173.eqiad.wmnet
* 12:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1172.eqiad.wmnet
* 12:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241', diff saved to https://phabricator.wikimedia.org/P89687 and previous config saved to /var/cache/conftool/dbconfig/20260303-120604-marostegui.json
* 12:04 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host wikikube-worker1352.eqiad.wmnet with OS trixie
* 12:02 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1353.eqiad.wmnet with reason: host reimage
* 11:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P89686 and previous config saved to /var/cache/conftool/dbconfig/20260303-115847-marostegui.json
* 11:58 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1355.eqiad.wmnet with reason: host reimage
* 11:52 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1354.eqiad.wmnet with reason: host reimage
* 11:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1241 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89685 and previous config saved to /var/cache/conftool/dbconfig/20260303-115057-marostegui.json
* 11:48 jayme@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on wikikube-worker1352.eqiad.wmnet with reason: host reimage
* 11:44 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1355.eqiad.wmnet with reason: host reimage
* 11:43 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1354.eqiad.wmnet with reason: host reimage
* 11:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210', diff saved to https://phabricator.wikimedia.org/P89684 and previous config saved to /var/cache/conftool/dbconfig/20260303-114341-marostegui.json
* 11:43 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1353.eqiad.wmnet with reason: host reimage
* 11:42 jayme@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on wikikube-worker1352.eqiad.wmnet with reason: host reimage
* 11:40 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
* 11:36 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
* 11:31 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1355.eqiad.wmnet with OS trixie
* 11:31 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1354.eqiad.wmnet with OS trixie
* 11:30 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1353.eqiad.wmnet with OS trixie
* 11:30 jayme@cumin1003: START - Cookbook sre.hosts.reimage for host wikikube-worker1352.eqiad.wmnet with OS trixie
* 11:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2210 ([[phab:T418465|T418465]])', diff saved to Unable to send diff to phaste and previous config saved to /var/cache/conftool/dbconfig/20260303-112828-marostegui.json
* 11:25 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1241 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89683 and previous config saved to /var/cache/conftool/dbconfig/20260303-112535-marostegui.json
* 11:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1241.eqiad.wmnet with reason: Maintenance
* 11:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89682 and previous config saved to /var/cache/conftool/dbconfig/20260303-112511-marostegui.json
* 11:21 jayme@deploy1003: helmfile [aux-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 11:18 jayme@deploy1003: helmfile [aux-k8s-codfw] START helmfile.d/admin 'apply'.
* 11:18 jayme@deploy1003: helmfile [aux-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:17 jayme@deploy1003: helmfile [aux-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:17 jayme@deploy1003: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 11:16 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker[1350-1351].eqiad.wmnet
* 11:16 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker[1350-1351].eqiad.wmnet
* 11:15 jayme@deploy1003: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 11:15 jayme@deploy1003: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 11:15 jayme@deploy1003: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 11:15 jayme@deploy1003: helmfile [ml-staging-codfw] DONE helmfile.d/admin 'apply'.
* 11:14 jayme@deploy1003: helmfile [ml-staging-codfw] START helmfile.d/admin 'apply'.
* 11:14 jayme@deploy1003: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'apply'.
* 11:13 jayme@deploy1003: helmfile [ml-serve-codfw] START helmfile.d/admin 'apply'.
* 11:13 jayme@deploy1003: helmfile [ml-serve-eqiad] DONE helmfile.d/admin 'apply'.
* 11:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1172.eqiad.wmnet
* 11:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1171.eqiad.wmnet
* 11:13 jayme@deploy1003: helmfile [ml-serve-eqiad] START helmfile.d/admin 'apply'.
* 11:13 jayme@deploy1003: helmfile [codfw] DONE helmfile.d/admin 'apply'.
* 11:12 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 11:11 jayme@deploy1003: helmfile [codfw] START helmfile.d/admin 'apply'.
* 11:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P89681 and previous config saved to /var/cache/conftool/dbconfig/20260303-111003-marostegui.json
* 11:09 jayme@deploy1003: helmfile [staging-codfw] DONE helmfile.d/admin 'apply'.
* 11:08 jayme@deploy1003: helmfile [staging-codfw] START helmfile.d/admin 'apply'.
* 11:08 jayme@deploy1003: helmfile [staging-eqiad] DONE helmfile.d/admin 'apply'.
* 11:07 jayme@deploy1003: helmfile [staging-eqiad] START helmfile.d/admin 'apply'.
* 11:07 jayme@deploy1003: helmfile [eqiad] DONE helmfile.d/admin 'apply'.
* 11:06 jayme@deploy1003: helmfile [eqiad] START helmfile.d/admin 'apply'.
* 11:05 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89680 and previous config saved to /var/cache/conftool/dbconfig/20260303-110551-marostegui.json
* 11:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2210.codfw.wmnet with reason: Maintenance
* 11:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89679 and previous config saved to /var/cache/conftool/dbconfig/20260303-110527-marostegui.json
* 10:59 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1171.eqiad.wmnet
* 10:59 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1170.eqiad.wmnet
* 10:57 slyngshede@dns1004: END - running authdns-update
* 10:55 slyngshede@dns1004: START - running authdns-update
* 10:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238', diff saved to https://phabricator.wikimedia.org/P89678 and previous config saved to /var/cache/conftool/dbconfig/20260303-105455-marostegui.json
* 10:54 hashar@deploy2002: Finished deploy [gerrit/gerrit@12177b1]: wm-checks-api: add tag for Selenium jobs (duration: 00m 13s)
* 10:54 hashar@deploy2002: Started deploy [gerrit/gerrit@12177b1]: wm-checks-api: add tag for Selenium jobs
* 10:51 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_eqsin and A:cp - 3.0 upgrade ()
* 10:51 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_eqsin and A:cp - 3.0 upgrade ()
* 10:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P89677 and previous config saved to /var/cache/conftool/dbconfig/20260303-105020-marostegui.json
* 10:47 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-reverted' for release 'main' .
* 10:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1170.eqiad.wmnet
* 10:45 fabfur: start upgrading haproxy to 3.0 on A:cp-eqsin ([[phab:T417253|T417253]])
* 10:43 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-goodfaith' for release 'main' .
* 10:41 moritzm: installing Django security updates
* 10:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1238 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89676 and previous config saved to /var/cache/conftool/dbconfig/20260303-103947-marostegui.json
* 10:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206', diff saved to https://phabricator.wikimedia.org/P89675 and previous config saved to /var/cache/conftool/dbconfig/20260303-103512-marostegui.json
* 10:34 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:33 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:33 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:32 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:31 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-editquality-damaging' for release 'main' .
* 10:25 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-draftquality' for release 'main' .
* 10:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89674 and previous config saved to /var/cache/conftool/dbconfig/20260303-102004-marostegui.json
* 10:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1238 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89673 and previous config saved to /var/cache/conftool/dbconfig/20260303-101800-marostegui.json
* 10:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1238.eqiad.wmnet with reason: Maintenance
* 10:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89672 and previous config saved to /var/cache/conftool/dbconfig/20260303-101747-marostegui.json
* 09:57 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articletopic' for release 'main' .
* 09:56 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89670 and previous config saved to /var/cache/conftool/dbconfig/20260303-095655-marostegui.json
* 09:56 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2206.codfw.wmnet with reason: Maintenance
* 09:53 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revscoring-articlequality' for release 'main' .
* 09:51 moritzm: installing qemu security updates
* 09:48 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revision-models' for release 'main' .
* 09:48 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revise-tone-task-generator' for release 'main' .
* 09:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221', diff saved to https://phabricator.wikimedia.org/P89669 and previous config saved to /var/cache/conftool/dbconfig/20260303-094732-marostegui.json
* 09:47 elukey@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 09:47 elukey@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 09:45 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'readability' for release 'main' .
* 09:45 elukey@deploy2002: helmfile [staging-eqiad] DONE helmfile.d/admin 'sync'.
* 09:44 elukey@deploy2002: helmfile [staging-eqiad] START helmfile.d/admin 'sync'.
* 09:44 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'ores-legacy' for release 'main' .
* 09:44 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'logo-detection' for release 'main' .
* 09:44 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db1176.eqiad.wmnet with reason: host reimage
* 09:43 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 09:40 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-models' for release 'main' .
* 09:38 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db1176.eqiad.wmnet with reason: host reimage
* 09:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2199.codfw.wmnet with reason: Maintenance
* 09:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89668 and previous config saved to /var/cache/conftool/dbconfig/20260303-093542-marostegui.json
* 09:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89667 and previous config saved to /var/cache/conftool/dbconfig/20260303-093224-marostegui.json
* 09:32 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'article-descriptions' for release 'main' .
* 09:23 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade ()
* 09:23 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db1176.eqiad.wmnet with OS trixie
* 09:21 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 09:20 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 09:20 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade ()
* 09:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P89666 and previous config saved to /var/cache/conftool/dbconfig/20260303-092034-marostegui.json
* 09:19 arnaudb@dns1004: END - running authdns-update
* 09:18 arnaudb@dns1004: START - running authdns-update
* 09:17 moritzm: installing libbpf updates from Bookworm point release
* 09:08 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89665 and previous config saved to /var/cache/conftool/dbconfig/20260303-090818-marostegui.json
* 09:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on 6 hosts with reason: Maintenance
* 09:07 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1221.eqiad.wmnet with reason: Maintenance
* 09:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89664 and previous config saved to /var/cache/conftool/dbconfig/20260303-090731-marostegui.json
* 09:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172', diff saved to https://phabricator.wikimedia.org/P89663 and previous config saved to /var/cache/conftool/dbconfig/20260303-090526-marostegui.json
* 08:54 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) pool all services in codfw/ml-serve-codfw: maintenance
* 08:53 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster pool all services in codfw/ml-serve-codfw: maintenance
* 08:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P89662 and previous config saved to /var/cache/conftool/dbconfig/20260303-085224-marostegui.json
* 08:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89661 and previous config saved to /var/cache/conftool/dbconfig/20260303-085019-marostegui.json
* 08:47 moritzm: powercycling lvs1013
* 08:41 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_ulsfo and A:cp - 3.0 upgrade ()
* 08:41 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_ulsfo and A:cp - 3.0 upgrade ()
* 08:37 fabfur: start upgrading haproxy to 3.0 on A:cp-ulsfo ([[phab:T417253|T417253]])
* 08:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199', diff saved to https://phabricator.wikimedia.org/P89660 and previous config saved to /var/cache/conftool/dbconfig/20260303-083716-marostegui.json
* 08:32 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'articletopic-outlink' for release 'main' .
* 08:32 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] Ran 'sync' command on namespace 'revertrisk' for release 'main' .
* 08:31 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] DONE helmfile.d/admin 'sync'.
* 08:30 dpogorzelski@deploy2002: helmfile [ml-serve-codfw] START helmfile.d/admin 'sync'.
* 08:28 dpogorzelski@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-cluster (exit_code=0) depool all services in codfw/ml-serve-codfw: maintenance
* 08:27 dpogorzelski@cumin1003: START - Cookbook sre.k8s.pool-depool-cluster depool all services in codfw/ml-serve-codfw: maintenance
* 08:24 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89659 and previous config saved to /var/cache/conftool/dbconfig/20260303-082424-marostegui.json
* 08:24 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2172.codfw.wmnet with reason: Maintenance
* 08:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89658 and previous config saved to /var/cache/conftool/dbconfig/20260303-082400-marostegui.json
* 08:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1199 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89657 and previous config saved to /var/cache/conftool/dbconfig/20260303-082209-marostegui.json
* 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P89656 and previous config saved to /var/cache/conftool/dbconfig/20260303-080853-marostegui.json
* 08:07 moritzm: installing PAM security updates on Bookworm
* 07:55 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1199 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89655 and previous config saved to /var/cache/conftool/dbconfig/20260303-075526-marostegui.json
* 07:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1199.eqiad.wmnet with reason: Maintenance
* 07:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89654 and previous config saved to /var/cache/conftool/dbconfig/20260303-075502-marostegui.json
* 07:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155', diff saved to https://phabricator.wikimedia.org/P89653 and previous config saved to /var/cache/conftool/dbconfig/20260303-075345-marostegui.json
* 07:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P89652 and previous config saved to /var/cache/conftool/dbconfig/20260303-073955-marostegui.json
* 07:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2155 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89651 and previous config saved to /var/cache/conftool/dbconfig/20260303-073838-marostegui.json
* 07:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190', diff saved to https://phabricator.wikimedia.org/P89650 and previous config saved to /var/cache/conftool/dbconfig/20260303-072447-marostegui.json
* 07:20 jmm@cumin2002: END (PASS) - Cookbook sre.ganeti.drain-node (exit_code=0) for draining ganeti node ganeti1051.eqiad.wmnet
* 07:18 jmm@cumin2002: START - Cookbook sre.ganeti.drain-node for draining ganeti node ganeti1051.eqiad.wmnet
* 07:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2155 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89649 and previous config saved to /var/cache/conftool/dbconfig/20260303-071054-marostegui.json
* 07:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2155.codfw.wmnet with reason: Maintenance
* 07:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89648 and previous config saved to /var/cache/conftool/dbconfig/20260303-071029-marostegui.json
* 07:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1190 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89647 and previous config saved to /var/cache/conftool/dbconfig/20260303-070940-marostegui.json
* 06:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P89646 and previous config saved to /var/cache/conftool/dbconfig/20260303-065523-marostegui.json
* 06:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1190 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89645 and previous config saved to /var/cache/conftool/dbconfig/20260303-064405-marostegui.json
* 06:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1190.eqiad.wmnet with reason: Maintenance
* 06:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147', diff saved to https://phabricator.wikimedia.org/P89644 and previous config saved to /var/cache/conftool/dbconfig/20260303-064015-marostegui.json
* 06:33 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) db2240 gradually with 4 steps - repool after schema change
* 06:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2147 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89642 and previous config saved to /var/cache/conftool/dbconfig/20260303-062507-marostegui.json
* 05:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2147 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89639 and previous config saved to /var/cache/conftool/dbconfig/20260303-055834-marostegui.json
* 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2147.codfw.wmnet with reason: Maintenance
* 05:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1150.eqiad.wmnet with reason: Maintenance
* 05:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2212.codfw.wmnet with reason: Maintenance
* 05:48 marostegui@cumin1003: START - Cookbook sre.mysql.pool db2240 gradually with 4 steps - repool after schema change
* 05:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1163.eqiad.wmnet with reason: Maintenance
* 05:01 mwpresync@deploy2002: Pruned MediaWiki: 1.46.0-wmf.15 (duration: 01m 10s)
* 04:43 mwpresync@deploy2002: Finished scap sync-world: testwikis to 1.46.0-wmf.18 refs [[phab:T413809|T413809]] (duration: 39m 43s)
* 04:03 mwpresync@deploy2002: Started scap sync-world: testwikis to 1.46.0-wmf.18 refs [[phab:T413809|T413809]]
* 03:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 03:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89637 and previous config saved to /var/cache/conftool/dbconfig/20260303-035746-marostegui.json
* 03:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P89636 and previous config saved to /var/cache/conftool/dbconfig/20260303-034239-marostegui.json
* 03:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251', diff saved to https://phabricator.wikimedia.org/P89635 and previous config saved to /var/cache/conftool/dbconfig/20260303-032731-marostegui.json
* 03:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1251 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89634 and previous config saved to /var/cache/conftool/dbconfig/20260303-031224-marostegui.json
* 03:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1251 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89633 and previous config saved to /var/cache/conftool/dbconfig/20260303-030217-marostegui.json
* 03:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1251.eqiad.wmnet with reason: Maintenance
* 02:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1240.eqiad.wmnet with reason: Maintenance
* 02:08 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 08m 00s)
* 02:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1239.eqiad.wmnet with reason: Maintenance
* 02:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89632 and previous config saved to /var/cache/conftool/dbconfig/20260303-020817-marostegui.json
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 01:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P89631 and previous config saved to /var/cache/conftool/dbconfig/20260303-015309-marostegui.json
* 01:42 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mwlog2003.codfw.wmnet with OS trixie
* 01:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235', diff saved to https://phabricator.wikimedia.org/P89630 and previous config saved to /var/cache/conftool/dbconfig/20260303-013802-marostegui.json
* 01:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89629 and previous config saved to /var/cache/conftool/dbconfig/20260303-013719-marostegui.json
* 01:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1235 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89628 and previous config saved to /var/cache/conftool/dbconfig/20260303-012254-marostegui.json
* 01:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P89627 and previous config saved to /var/cache/conftool/dbconfig/20260303-012211-marostegui.json
* 01:19 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mwlog2003.codfw.wmnet with reason: host reimage
* 01:11 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mwlog2003.codfw.wmnet with reason: host reimage
* 01:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1235 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89626 and previous config saved to /var/cache/conftool/dbconfig/20260303-011151-marostegui.json
* 01:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1235.eqiad.wmnet with reason: Maintenance
* 01:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89625 and previous config saved to /var/cache/conftool/dbconfig/20260303-011128-marostegui.json
* 01:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216', diff saved to https://phabricator.wikimedia.org/P89624 and previous config saved to /var/cache/conftool/dbconfig/20260303-010703-marostegui.json
* 00:59 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247189{{!}}Revert "ImageListPager: Properly support file schema migration read new"]] (duration: 08m 12s)
* 00:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P89623 and previous config saved to /var/cache/conftool/dbconfig/20260303-005620-marostegui.json
* 00:56 zabe@deploy2002: zabe: Continuing with sync
* 00:54 herron@cumin1003: START - Cookbook sre.hosts.reimage for host mwlog2003.codfw.wmnet with OS trixie
* 00:53 zabe@deploy2002: zabe: Backport for [[gerrit:1247189{{!}}Revert "ImageListPager: Properly support file schema migration read new"]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:53 herron@cumin1003: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host mwlog2003.codfw.wmnet with OS trixie
* 00:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2216 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89622 and previous config saved to /var/cache/conftool/dbconfig/20260303-005156-marostegui.json
* 00:51 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1247189{{!}}Revert "ImageListPager: Properly support file schema migration read new"]]
* 00:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234', diff saved to https://phabricator.wikimedia.org/P89621 and previous config saved to /var/cache/conftool/dbconfig/20260303-004112-marostegui.json
* 00:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2216 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89620 and previous config saved to /var/cache/conftool/dbconfig/20260303-004056-marostegui.json
* 00:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2216.codfw.wmnet with reason: Maintenance
* 00:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89619 and previous config saved to /var/cache/conftool/dbconfig/20260303-004033-marostegui.json
* 00:31 herron@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host mwlog1003.eqiad.wmnet with OS trixie
* 00:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1234 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89618 and previous config saved to /var/cache/conftool/dbconfig/20260303-002604-marostegui.json
* 00:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to https://phabricator.wikimedia.org/P89617 and previous config saved to /var/cache/conftool/dbconfig/20260303-002525-marostegui.json
* 00:20 zabe@deploy2002: helmfile [eqiad] DONE helmfile.d/services/mw-experimental: apply
* 00:18 zabe@deploy2002: helmfile [eqiad] START helmfile.d/services/mw-experimental: apply
* 00:18 zabe@deploy2002: Finished scap sync-world: [[phab:T418327|T418327]] (duration: 05m 01s)
* 00:15 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1234 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89616 and previous config saved to /var/cache/conftool/dbconfig/20260303-001504-marostegui.json
* 00:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1234.eqiad.wmnet with reason: Maintenance
* 00:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89615 and previous config saved to /var/cache/conftool/dbconfig/20260303-001440-marostegui.json
* 00:13 zabe@deploy2002: Started scap sync-world: [[phab:T418327|T418327]]
* 00:11 zabe@deploy2002: zabe: Continuing with sync
* 00:10 zabe@deploy2002: zabe: Backport for [[gerrit:1247068{{!}}ImageListPager: Properly support file schema migration read new (T418327)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 00:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203', diff saved to https://phabricator.wikimedia.org/P89614 and previous config saved to /var/cache/conftool/dbconfig/20260303-001018-marostegui.json
* 00:08 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1247068{{!}}ImageListPager: Properly support file schema migration read new (T418327)]]
== 2026-03-02 ==
* 23:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P89613 and previous config saved to /var/cache/conftool/dbconfig/20260302-235933-marostegui.json
* 23:58 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1240320{{!}}Stop writing to il_to on testwiki (T415787)]] (duration: 06m 02s)
* 23:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89612 and previous config saved to /var/cache/conftool/dbconfig/20260302-235511-marostegui.json
* 23:54 zabe@deploy2002: zabe: Continuing with sync
* 23:53 zabe@deploy2002: zabe: Backport for [[gerrit:1240320{{!}}Stop writing to il_to on testwiki (T415787)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:52 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1240320{{!}}Stop writing to il_to on testwiki (T415787)]]
* 23:51 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 7 days, 0:00:00 on cp2058.codfw.wmnet with reason: dcops troubleshooting for [[phab:T418527|T418527]]
* 23:50 zabe@deploy2002: Finished scap sync-world: Backport for [[gerrit:1246880{{!}}multiversion: Stop setting MW_USE_CONFIG_SCHEMA (T304460)]] (duration: 07m 10s)
* 23:47 zabe@deploy2002: zabe: Continuing with sync
* 23:45 zabe@deploy2002: zabe: Backport for [[gerrit:1246880{{!}}multiversion: Stop setting MW_USE_CONFIG_SCHEMA (T304460)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 23:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232', diff saved to https://phabricator.wikimedia.org/P89611 and previous config saved to /var/cache/conftool/dbconfig/20260302-234425-marostegui.json
* 23:44 herron@cumin1003: START - Cookbook sre.hosts.reimage for host mwlog2003.codfw.wmnet with OS trixie
* 23:43 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89610 and previous config saved to /var/cache/conftool/dbconfig/20260302-234350-marostegui.json
* 23:43 zabe@deploy2002: Started scap sync-world: Backport for [[gerrit:1246880{{!}}multiversion: Stop setting MW_USE_CONFIG_SCHEMA (T304460)]]
* 23:43 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2203.codfw.wmnet with reason: Maintenance
* 23:35 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2202.codfw.wmnet with reason: Maintenance
* 23:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89609 and previous config saved to /var/cache/conftool/dbconfig/20260302-233517-marostegui.json
* 23:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1232 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89608 and previous config saved to /var/cache/conftool/dbconfig/20260302-232918-marostegui.json
* 23:25 dwisehaupt@dns1006: END - running authdns-update
* 23:24 dwisehaupt@dns1006: START - running authdns-update
* 23:23 herron@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on mwlog1003.eqiad.wmnet with reason: host reimage
* 23:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P89607 and previous config saved to /var/cache/conftool/dbconfig/20260302-232009-marostegui.json
* 23:18 herron@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on mwlog1003.eqiad.wmnet with reason: host reimage
* 23:17 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1232 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89606 and previous config saved to /var/cache/conftool/dbconfig/20260302-231723-marostegui.json
* 23:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1232.eqiad.wmnet with reason: Maintenance
* 23:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89605 and previous config saved to /var/cache/conftool/dbconfig/20260302-231658-marostegui.json
* 23:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188', diff saved to https://phabricator.wikimedia.org/P89604 and previous config saved to /var/cache/conftool/dbconfig/20260302-230502-marostegui.json
* 23:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P89603 and previous config saved to /var/cache/conftool/dbconfig/20260302-230151-marostegui.json
* 22:57 herron@cumin1003: START - Cookbook sre.hosts.reimage for host mwlog1003.eqiad.wmnet with OS trixie
* 22:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2188 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89602 and previous config saved to /var/cache/conftool/dbconfig/20260302-224954-marostegui.json
* 22:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219', diff saved to https://phabricator.wikimedia.org/P89601 and previous config saved to /var/cache/conftool/dbconfig/20260302-224643-marostegui.json
* 22:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2188 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89600 and previous config saved to /var/cache/conftool/dbconfig/20260302-223612-marostegui.json
* 22:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2188.codfw.wmnet with reason: Maintenance
* 22:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89599 and previous config saved to /var/cache/conftool/dbconfig/20260302-223548-marostegui.json
* 22:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89598 and previous config saved to /var/cache/conftool/dbconfig/20260302-223135-marostegui.json
* 22:21 maryum: Deployed security fix for [[phab:T418179|T418179]]
* 22:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P89597 and previous config saved to /var/cache/conftool/dbconfig/20260302-222041-marostegui.json
* 22:19 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1219 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89596 and previous config saved to /var/cache/conftool/dbconfig/20260302-221938-marostegui.json
* 22:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1219.eqiad.wmnet with reason: Maintenance
* 22:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89595 and previous config saved to /var/cache/conftool/dbconfig/20260302-221925-marostegui.json
* 22:10 aaron@deploy2002: Finished scap sync-world: Backport for [[gerrit:1242613{{!}}Add growthexperiments.v0 to $wgRestSandboxSpecs (T414470)]] (duration: 06m 39s)
* 22:06 aaron@deploy2002: aaron: Continuing with sync
* 22:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176', diff saved to https://phabricator.wikimedia.org/P89594 and previous config saved to /var/cache/conftool/dbconfig/20260302-220533-marostegui.json
* 22:05 aaron@deploy2002: aaron: Backport for [[gerrit:1242613{{!}}Add growthexperiments.v0 to $wgRestSandboxSpecs (T414470)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 22:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P89593 and previous config saved to /var/cache/conftool/dbconfig/20260302-220418-marostegui.json
* 22:03 aaron@deploy2002: Started scap sync-world: Backport for [[gerrit:1242613{{!}}Add growthexperiments.v0 to $wgRestSandboxSpecs (T414470)]]
* 22:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-backup2003.codfw.wmnet with OS trixie
* 22:03 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-backup2004.codfw.wmnet with OS trixie
* 22:03 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 22:03 jhancock@cumin2002: END (FAIL) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=99) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 22:02 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 22:01 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247149{{!}}ApiCSPReport: Use structured logging for CSP reports]] (duration: 08m 19s)
* 21:57 catrope@deploy2002: catrope: Continuing with sync
* 21:56 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jhancock@cumin2002"
* 21:55 catrope@deploy2002: catrope: Backport for [[gerrit:1247149{{!}}ApiCSPReport: Use structured logging for CSP reports]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:53 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1247149{{!}}ApiCSPReport: Use structured logging for CSP reports]]
* 21:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2176 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89592 and previous config saved to /var/cache/conftool/dbconfig/20260302-215025-marostegui.json
* 21:50 brett@cumin2002: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on cp2043.codfw.wmnet with reason: These are test instances, failing should not notif
* 21:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218', diff saved to https://phabricator.wikimedia.org/P89591 and previous config saved to /var/cache/conftool/dbconfig/20260302-214910-marostegui.json
* 21:48 inflatador: bking@desktop restarting wdqs codfw to clear ProbeDown alerts
* 21:43 brett@cumin2002: END (FAIL) - Cookbook sre.hosts.reboot-single (exit_code=1) for host cp2043.codfw.wmnet
* 21:42 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-backup2004.codfw.wmnet with reason: host reimage
* 21:39 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2176 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89590 and previous config saved to /var/cache/conftool/dbconfig/20260302-213957-marostegui.json
* 21:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2176.codfw.wmnet with reason: Maintenance
* 21:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89589 and previous config saved to /var/cache/conftool/dbconfig/20260302-213934-marostegui.json
* 21:38 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-backup2003.codfw.wmnet with reason: host reimage
* 21:36 eevans@cumin1003: END (PASS) - Cookbook sre.cassandra.roll-restart (exit_code=0) for nodes matching A:cassandra-dev: Testing removal of OpenJDK 8 support - eevans@cumin1003
* 21:34 catrope@deploy2002: Finished scap sync-world: Backport for [[gerrit:1226024{{!}}Add Comments namespace for shnwikinews (T414403)]] (duration: 07m 07s)
* 21:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89588 and previous config saved to /var/cache/conftool/dbconfig/20260302-213402-marostegui.json
* 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-backup2004.codfw.wmnet with reason: host reimage
* 21:33 jhancock@cumin2002: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-backup2003.codfw.wmnet with reason: host reimage
* 21:30 brett@cumin2002: START - Cookbook sre.hosts.reboot-single for host cp2043.codfw.wmnet
* 21:30 catrope@deploy2002: shivaanshsingh, catrope: Continuing with sync
* 21:29 catrope@deploy2002: shivaanshsingh, catrope: Backport for [[gerrit:1226024{{!}}Add Comments namespace for shnwikinews (T414403)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:27 catrope@deploy2002: Started scap sync-world: Backport for [[gerrit:1226024{{!}}Add Comments namespace for shnwikinews (T414403)]]
* 21:24 kemayo@deploy2002: Finished scap sync-world: Backport for [[gerrit:1243990{{!}}Suggestion Mode: add values for suggestion feedback properties (T401739)]], [[gerrit:1240721{{!}}Stop PasteCheck A/B test (T417429)]] (duration: 10m 55s)
* 21:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P89587 and previous config saved to /var/cache/conftool/dbconfig/20260302-212426-marostegui.json
* 21:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89586 and previous config saved to /var/cache/conftool/dbconfig/20260302-212345-marostegui.json
* 21:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1218.eqiad.wmnet with reason: Maintenance
* 21:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89585 and previous config saved to /var/cache/conftool/dbconfig/20260302-212321-marostegui.json
* 21:20 kemayo@deploy2002: esanders, kemayo, caro: Continuing with sync
* 21:17 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host ms-backup2004.codfw.wmnet with OS trixie
* 21:16 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host ms-backup2003.codfw.wmnet with OS trixie
* 21:16 eevans@cumin1003: START - Cookbook sre.cassandra.roll-restart for nodes matching A:cassandra-dev: Testing removal of OpenJDK 8 support - eevans@cumin1003
* 21:16 jhancock@cumin2002: END (FAIL) - Cookbook sre.hardware.upgrade-firmware (exit_code=1) upgrade firmware for hosts ['ms-backup2003']
* 21:15 jhancock@cumin2002: START - Cookbook sre.hardware.upgrade-firmware upgrade firmware for hosts ['ms-backup2003']
* 21:15 kemayo@deploy2002: esanders, kemayo, caro: Backport for [[gerrit:1243990{{!}}Suggestion Mode: add values for suggestion feedback properties (T401739)]], [[gerrit:1240721{{!}}Stop PasteCheck A/B test (T417429)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:14 inflatador: bking@apt1002 reprepro --component thirdparty/opensearch3 update trixie-wikimedia [[phab:T418388|T418388]]
* 21:13 kemayo@deploy2002: Started scap sync-world: Backport for [[gerrit:1243990{{!}}Suggestion Mode: add values for suggestion feedback properties (T401739)]], [[gerrit:1240721{{!}}Stop PasteCheck A/B test (T417429)]]
* 21:12 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-backup2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:11 jhancock@cumin2002: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host ms-backup2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 21:10 dani@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247107{{!}}Undeploy Comparative Reader Research survey on eswiki (T417834)]], [[gerrit:1247105{{!}}Undeploy Comparative Reader Research survey on enwiki (T417829)]] (duration: 06m 52s)
* 21:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174', diff saved to https://phabricator.wikimedia.org/P89584 and previous config saved to /var/cache/conftool/dbconfig/20260302-210919-marostegui.json
* 21:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P89583 and previous config saved to /var/cache/conftool/dbconfig/20260302-210813-marostegui.json
* 21:06 dani@deploy2002: dani: Continuing with sync
* 21:05 dani@deploy2002: dani: Backport for [[gerrit:1247107{{!}}Undeploy Comparative Reader Research survey on eswiki (T417834)]], [[gerrit:1247105{{!}}Undeploy Comparative Reader Research survey on enwiki (T417829)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 21:03 dani@deploy2002: Started scap sync-world: Backport for [[gerrit:1247107{{!}}Undeploy Comparative Reader Research survey on eswiki (T417834)]], [[gerrit:1247105{{!}}Undeploy Comparative Reader Research survey on enwiki (T417829)]]
* 20:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-backup2004.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:58 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host ms-backup2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-backup2004
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-backup2004
* 20:56 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-backup2003
* 20:56 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host ms-backup2003
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 20:54 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-backup2003 to codfw - jhancock@cumin2002"
* 20:54 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding ms-backup2003 to codfw - jhancock@cumin2002"
* 20:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89582 and previous config saved to /var/cache/conftool/dbconfig/20260302-205411-marostegui.json
* 20:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206', diff saved to https://phabricator.wikimedia.org/P89581 and previous config saved to /var/cache/conftool/dbconfig/20260302-205307-marostegui.json
* 20:50 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 20:42 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:41 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89580 and previous config saved to /var/cache/conftool/dbconfig/20260302-204136-marostegui.json
* 20:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2174.codfw.wmnet with reason: Maintenance
* 20:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89579 and previous config saved to /var/cache/conftool/dbconfig/20260302-204112-marostegui.json
* 20:41 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:39 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89578 and previous config saved to /var/cache/conftool/dbconfig/20260302-203759-marostegui.json
* 20:37 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:36 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:33 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:31 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:30 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 20:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1206 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89577 and previous config saved to /var/cache/conftool/dbconfig/20260302-202740-marostegui.json
* 20:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1206.eqiad.wmnet with reason: Maintenance
* 20:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89576 and previous config saved to /var/cache/conftool/dbconfig/20260302-202716-marostegui.json
* 20:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P89575 and previous config saved to /var/cache/conftool/dbconfig/20260302-202604-marostegui.json
* 20:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P89574 and previous config saved to /var/cache/conftool/dbconfig/20260302-201209-marostegui.json
* 20:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173', diff saved to https://phabricator.wikimedia.org/P89573 and previous config saved to /var/cache/conftool/dbconfig/20260302-201057-marostegui.json
* 20:01 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 20:00 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=93) for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 19:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196', diff saved to https://phabricator.wikimedia.org/P89572 and previous config saved to /var/cache/conftool/dbconfig/20260302-195702-marostegui.json
* 19:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2173 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89571 and previous config saved to /var/cache/conftool/dbconfig/20260302-195549-marostegui.json
* 19:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2173 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89570 and previous config saved to /var/cache/conftool/dbconfig/20260302-194435-marostegui.json
* 19:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2173.codfw.wmnet with reason: Maintenance
* 19:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89569 and previous config saved to /var/cache/conftool/dbconfig/20260302-194411-marostegui.json
* 19:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1196 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89568 and previous config saved to /var/cache/conftool/dbconfig/20260302-194155-marostegui.json
* 19:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1196 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89566 and previous config saved to /var/cache/conftool/dbconfig/20260302-193119-marostegui.json
* 19:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1013,1017].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 19:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1196.eqiad.wmnet with reason: Maintenance
* 19:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89565 and previous config saved to /var/cache/conftool/dbconfig/20260302-193046-marostegui.json
* 19:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P89564 and previous config saved to /var/cache/conftool/dbconfig/20260302-192903-marostegui.json
* 19:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P89563 and previous config saved to /var/cache/conftool/dbconfig/20260302-191539-marostegui.json
* 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170', diff saved to https://phabricator.wikimedia.org/P89562 and previous config saved to /var/cache/conftool/dbconfig/20260302-191355-marostegui.json
* 19:12 dzahn@deploy2002: helmfile [staging] DONE helmfile.d/services/miscweb: apply
* 19:12 dzahn@deploy2002: helmfile [staging] START helmfile.d/services/miscweb: apply
* 19:04 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.reimage (exit_code=99) for host ms-be2095.codfw.wmnet with OS bullseye
* 19:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195', diff saved to https://phabricator.wikimedia.org/P89561 and previous config saved to /var/cache/conftool/dbconfig/20260302-190032-marostegui.json
* 18:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89560 and previous config saved to /var/cache/conftool/dbconfig/20260302-185848-marostegui.json
* 18:54 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 18:53 andrew@cumin2002: END (ERROR) - Cookbook sre.hosts.reimage (exit_code=97) for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 18:48 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89559 and previous config saved to /var/cache/conftool/dbconfig/20260302-184832-marostegui.json
* 18:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2170.codfw.wmnet with reason: Maintenance
* 18:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89558 and previous config saved to /var/cache/conftool/dbconfig/20260302-184808-marostegui.json
* 18:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89557 and previous config saved to /var/cache/conftool/dbconfig/20260302-184524-marostegui.json
* 18:34 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89556 and previous config saved to /var/cache/conftool/dbconfig/20260302-183449-marostegui.json
* 18:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1195.eqiad.wmnet with reason: Maintenance
* 18:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89555 and previous config saved to /var/cache/conftool/dbconfig/20260302-183425-marostegui.json
* 18:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P89554 and previous config saved to /var/cache/conftool/dbconfig/20260302-183300-marostegui.json
* 18:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P89553 and previous config saved to /var/cache/conftool/dbconfig/20260302-181918-marostegui.json
* 18:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153', diff saved to https://phabricator.wikimedia.org/P89552 and previous config saved to /var/cache/conftool/dbconfig/20260302-181753-marostegui.json
* 18:16 andrew@cumin2002: START - Cookbook sre.hosts.reimage for host cloudcephosd2008-dev.codfw.wmnet with OS trixie
* 18:04 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186', diff saved to https://phabricator.wikimedia.org/P89551 and previous config saved to /var/cache/conftool/dbconfig/20260302-180411-marostegui.json
* 18:03 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 18:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2153 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89550 and previous config saved to /var/cache/conftool/dbconfig/20260302-180245-marostegui.json
* 18:01 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:54 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:53 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host contint2003
* 17:53 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host contint2003
* 17:52 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:52 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
* 17:52 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
* 17:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2153 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89549 and previous config saved to /var/cache/conftool/dbconfig/20260302-174917-marostegui.json
* 17:49 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 17:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2153.codfw.wmnet with reason: Maintenance
* 17:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1186 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89548 and previous config saved to /var/cache/conftool/dbconfig/20260302-174903-marostegui.json
* 17:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89547 and previous config saved to /var/cache/conftool/dbconfig/20260302-174854-marostegui.json
* 17:46 jhancock@cumin2002: END (FAIL) - Cookbook sre.hosts.provision (exit_code=99) for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:44 jhancock@cumin2002: START - Cookbook sre.hosts.provision for host contint2003.mgmt.codfw.wmnet with chassis set policy FORCE_RESTART and with Dell SCP reboot policy FORCED
* 17:44 jhancock@cumin2002: START - Cookbook sre.hosts.reimage for host ms-be2095.codfw.wmnet with OS bullseye
* 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host contint2003
* 17:43 jhancock@cumin2002: START - Cookbook sre.network.configure-switch-interfaces for host contint2003
* 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 17:43 jhancock@cumin2002: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
* 17:42 jhancock@cumin2002: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: adding contint2003 to codfw - jhancock@cumin2002"
* 17:39 jhancock@cumin2002: START - Cookbook sre.dns.netbox
* 17:38 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1186 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89546 and previous config saved to /var/cache/conftool/dbconfig/20260302-173827-marostegui.json
* 17:38 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1186.eqiad.wmnet with reason: Maintenance
* 17:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89545 and previous config saved to /var/cache/conftool/dbconfig/20260302-173803-marostegui.json
* 17:37 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 17:36 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 17:34 fceratto@cumin1003: END (PASS) - Cookbook sre.mysql.update-replication (exit_code=0)
* 17:33 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 17:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P89544 and previous config saved to /var/cache/conftool/dbconfig/20260302-173347-marostegui.json
* 17:32 fceratto@cumin1003: END (FAIL) - Cookbook sre.mysql.update-replication (exit_code=99)
* 17:32 fceratto@cumin1003: START - Cookbook sre.mysql.update-replication
* 17:24 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 17:23 ebernhardson@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/opensearch-semantic-search: apply
* 17:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P89543 and previous config saved to /var/cache/conftool/dbconfig/20260302-172256-marostegui.json
* 17:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146', diff saved to https://phabricator.wikimedia.org/P89542 and previous config saved to /var/cache/conftool/dbconfig/20260302-171839-marostegui.json
* 17:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184', diff saved to https://phabricator.wikimedia.org/P89541 and previous config saved to /var/cache/conftool/dbconfig/20260302-170748-marostegui.json
* 17:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2146 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89540 and previous config saved to /var/cache/conftool/dbconfig/20260302-170331-marostegui.json
* 16:52 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host db2230.codfw.wmnet with OS trixie
* 16:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1184 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89539 and previous config saved to /var/cache/conftool/dbconfig/20260302-165240-marostegui.json
* 16:51 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2146 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89538 and previous config saved to /var/cache/conftool/dbconfig/20260302-165153-marostegui.json
* 16:51 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2146.codfw.wmnet with reason: Maintenance
* 16:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89537 and previous config saved to /var/cache/conftool/dbconfig/20260302-165129-marostegui.json
* 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1184 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89536 and previous config saved to /var/cache/conftool/dbconfig/20260302-164141-marostegui.json
* 16:41 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1184.eqiad.wmnet with reason: Maintenance
* 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89535 and previous config saved to /var/cache/conftool/dbconfig/20260302-164118-marostegui.json
* 16:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P89534 and previous config saved to /var/cache/conftool/dbconfig/20260302-163622-marostegui.json
* 16:29 fceratto@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on db2230.codfw.wmnet with reason: host reimage
* 16:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P89533 and previous config saved to /var/cache/conftool/dbconfig/20260302-162610-marostegui.json
* 16:21 fceratto@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on db2230.codfw.wmnet with reason: host reimage
* 16:21 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145', diff saved to https://phabricator.wikimedia.org/P89532 and previous config saved to /var/cache/conftool/dbconfig/20260302-162115-marostegui.json
* 16:19 moritzm: installing PAM security updates on Bookworm
* 16:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169', diff saved to https://phabricator.wikimedia.org/P89531 and previous config saved to /var/cache/conftool/dbconfig/20260302-161102-marostegui.json
* 16:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2145 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89530 and previous config saved to /var/cache/conftool/dbconfig/20260302-160607-marostegui.json
* 16:05 fceratto@cumin1003: START - Cookbook sre.hosts.reimage for host db2230.codfw.wmnet with OS trixie
* 15:56 moritzm: installing glibc bugfix updates from trixie point release
* 15:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1169 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89529 and previous config saved to /var/cache/conftool/dbconfig/20260302-155555-marostegui.json
* 15:55 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2145 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89528 and previous config saved to /var/cache/conftool/dbconfig/20260302-155527-marostegui.json
* 15:55 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2145.codfw.wmnet with reason: Maintenance
* 15:46 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2141.codfw.wmnet with reason: Maintenance
* 15:45 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1169.eqiad.wmnet
* 15:45 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1169 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89527 and previous config saved to /var/cache/conftool/dbconfig/20260302-154520-marostegui.json
* 15:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1169.eqiad.wmnet with reason: Maintenance
* 15:38 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'edit-check' for release 'main' .
* 15:34 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1169.eqiad.wmnet
* 15:33 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1167.eqiad.wmnet
* 15:32 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 15:32 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2165.codfw.wmnet with reason: Maintenance
* 15:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1209.eqiad.wmnet with reason: Maintenance
* 15:31 marostegui@cumin1003: dbctl commit (dc=all): 'Restore db1226 full weight after schema change', diff saved to https://phabricator.wikimedia.org/P89526 and previous config saved to /var/cache/conftool/dbconfig/20260302-153100-marostegui.json
* 15:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P89525 and previous config saved to /var/cache/conftool/dbconfig/20260302-152334-marostegui.json
* 15:22 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1167.eqiad.wmnet
* 15:22 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1166.eqiad.wmnet
* 15:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2198.codfw.wmnet with reason: Maintenance
* 15:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89524 and previous config saved to /var/cache/conftool/dbconfig/20260302-151838-marostegui.json
* 15:10 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1166.eqiad.wmnet
* 15:10 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1165.eqiad.wmnet
* 15:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226', diff saved to https://phabricator.wikimedia.org/P89523 and previous config saved to /var/cache/conftool/dbconfig/20260302-150826-marostegui.json
* 15:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P89522 and previous config saved to /var/cache/conftool/dbconfig/20260302-150330-marostegui.json
* 15:00 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.reimage (exit_code=0) for host ms-be1097.eqiad.wmnet with OS bullseye
* 15:00 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:58 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1165.eqiad.wmnet
* 14:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1164.eqiad.wmnet
* 14:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1226 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89520 and previous config saved to /var/cache/conftool/dbconfig/20260302-145318-marostegui.json
* 14:48 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1164.eqiad.wmnet
* 14:48 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1163.eqiad.wmnet
* 14:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195', diff saved to https://phabricator.wikimedia.org/P89519 and previous config saved to /var/cache/conftool/dbconfig/20260302-144823-marostegui.json
* 14:41 Lucas_WMDE: UTC afternoon backport+config window done
* 14:40 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247057{{!}}IPInfo: Set log level to "info" (T374718)]] (duration: 08m 01s)
* 14:37 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1163.eqiad.wmnet
* 14:36 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1162.eqiad.wmnet
* 14:36 lucaswerkmeister-wmde@deploy2002: kharlan, lucaswerkmeister-wmde: Continuing with sync
* 14:36 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1226 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89517 and previous config saved to /var/cache/conftool/dbconfig/20260302-143608-marostegui.json
* 14:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1226.eqiad.wmnet with reason: Maintenance
* 14:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89516 and previous config saved to /var/cache/conftool/dbconfig/20260302-143544-marostegui.json
* 14:34 lucaswerkmeister-wmde@deploy2002: kharlan, lucaswerkmeister-wmde: Backport for [[gerrit:1247057{{!}}IPInfo: Set log level to "info" (T374718)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89515 and previous config saved to /var/cache/conftool/dbconfig/20260302-143315-marostegui.json
* 14:32 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for [[gerrit:1247057{{!}}IPInfo: Set log level to "info" (T374718)]]
* 14:31 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
* 14:30 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for [[gerrit:1245364{{!}}Add configurations for graphql usage survey and its pipeline tests (T414476)]] (duration: 09m 44s)
* 14:27 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.hosts.reimage: Host reimage - jclark@cumin1003"
* 14:26 lucaswerkmeister-wmde@deploy2002: itamar, lucaswerkmeister-wmde: Continuing with sync
* 14:26 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 14:25 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1162.eqiad.wmnet
* 14:25 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1161.eqiad.wmnet
* 14:23 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
* 14:22 lucaswerkmeister-wmde@deploy2002: itamar, lucaswerkmeister-wmde: Backport for [[gerrit:1245364{{!}}Add configurations for graphql usage survey and its pipeline tests (T414476)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:20 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for [[gerrit:1245364{{!}}Add configurations for graphql usage survey and its pipeline tests (T414476)]]
* 14:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P89514 and previous config saved to /var/cache/conftool/dbconfig/20260302-142037-marostegui.json
* 14:19 elukey@cumin1003: START - Cookbook sre.hosts.provision for host ms-fe1013.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART and with Dell SCP reboot policy GRACEFUL
* 14:18 lucaswerkmeister-wmde@deploy2002: mwscript-k8s job started: namespaceDupes lawiki --fix # [[phab:T418706|T418706]]
* 14:18 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2195 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89513 and previous config saved to /var/cache/conftool/dbconfig/20260302-141834-marostegui.json
* 14:18 elukey@puppetserver1001: conftool action : set/pooled=no; selector: name=ms-fe1013.eqiad.wmnet
* 14:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2195.codfw.wmnet with reason: Maintenance
* 14:18 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host people2004.codfw.wmnet
* 14:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89512 and previous config saved to /var/cache/conftool/dbconfig/20260302-141810-marostegui.json
* 14:17 lucaswerkmeister-wmde@deploy2002: Finished scap sync-world: Backport for [[gerrit:1247063{{!}}lawiki: add Adumbratio (draft) namespace (T418706)]] (duration: 07m 27s)
* 14:14 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host people2004.codfw.wmnet
* 14:13 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, anzx: Continuing with sync
* 14:13 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 14:13 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1161.eqiad.wmnet
* 14:13 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1160.eqiad.wmnet
* 14:13 moritzm: installing libcap2 updates from Trixie point release
* 14:12 lucaswerkmeister-wmde@deploy2002: lucaswerkmeister-wmde, anzx: Backport for [[gerrit:1247063{{!}}lawiki: add Adumbratio (draft) namespace (T418706)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 14:11 elukey@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:11 elukey@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 14:10 elukey@deploy2002: helmfile [staging-codfw] DONE helmfile.d/admin 'sync'.
* 14:10 lucaswerkmeister-wmde@deploy2002: Started scap sync-world: Backport for [[gerrit:1247063{{!}}lawiki: add Adumbratio (draft) namespace (T418706)]]
* 14:10 elukey@deploy2002: helmfile [staging-codfw] START helmfile.d/admin 'sync'.
* 14:08 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1028.eqiad.wmnet
* 14:05 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host idp-test1005.wikimedia.org
* 14:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214', diff saved to https://phabricator.wikimedia.org/P89511 and previous config saved to /var/cache/conftool/dbconfig/20260302-140529-marostegui.json
* 14:04 jclark@cumin1003: END (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 2:00:00 on ms-be1097.eqiad.wmnet with reason: host reimage
* 14:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P89510 and previous config saved to /var/cache/conftool/dbconfig/20260302-140302-marostegui.json
* 14:02 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1028.eqiad.wmnet
* 14:01 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1160.eqiad.wmnet
* 14:01 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host idp-test1005.wikimedia.org
* 14:01 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1159.eqiad.wmnet
* 14:00 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1025.eqiad.wmnet
* 13:57 jclark@cumin1003: START - Cookbook sre.hosts.downtime for 2:00:00 on ms-be1097.eqiad.wmnet with reason: host reimage
* 13:54 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1025.eqiad.wmnet
* 13:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1214 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89509 and previous config saved to /var/cache/conftool/dbconfig/20260302-135021-marostegui.json
* 13:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181', diff saved to https://phabricator.wikimedia.org/P89508 and previous config saved to /var/cache/conftool/dbconfig/20260302-134754-marostegui.json
* 13:47 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1159.eqiad.wmnet
* 13:47 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1158.eqiad.wmnet
* 13:40 jclark@cumin1003: START - Cookbook sre.hosts.reimage for host ms-be1097.eqiad.wmnet with OS bullseye
* 13:38 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1097
* 13:38 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1097
* 13:38 jclark@cumin1003: END (PASS) - Cookbook sre.dns.netbox (exit_code=0)
* 13:38 jclark@cumin1003: END (PASS) - Cookbook sre.puppet.sync-netbox-hiera (exit_code=0) generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt ms-be1097 - jclark@cumin1003"
* 13:37 jclark@cumin1003: START - Cookbook sre.puppet.sync-netbox-hiera generate netbox hiera data: "Triggered by cookbooks.sre.dns.netbox: update network and mgmt ms-be1097 - jclark@cumin1003"
* 13:35 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1158.eqiad.wmnet
* 13:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1157.eqiad.wmnet
* 13:35 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
* 13:35 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1214 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89507 and previous config saved to /var/cache/conftool/dbconfig/20260302-133503-marostegui.json
* 13:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1214.eqiad.wmnet with reason: Maintenance
* 13:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89506 and previous config saved to /var/cache/conftool/dbconfig/20260302-133440-marostegui.json
* 13:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2181 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89505 and previous config saved to /var/cache/conftool/dbconfig/20260302-133247-marostegui.json
* 13:28 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 13:27 jclark@cumin1003: START - Cookbook sre.dns.netbox
* 13:27 jclark@cumin1003: END (PASS) - Cookbook sre.network.configure-switch-interfaces (exit_code=0) for host ms-be1097
* 13:26 jclark@cumin1003: START - Cookbook sre.network.configure-switch-interfaces for host ms-be1097
* 13:24 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1157.eqiad.wmnet
* 13:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1156.eqiad.wmnet
* 13:22 brouberol: Running `echo 'https://turnilo-next.wikimedia.org' {{!}} mwscript-k8s --attach -- purgeList.php`
* 13:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P89504 and previous config saved to /var/cache/conftool/dbconfig/20260302-131932-marostegui.json
* 13:16 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2181 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89503 and previous config saved to /var/cache/conftool/dbconfig/20260302-131653-marostegui.json
* 13:16 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2181.codfw.wmnet with reason: Maintenance
* 13:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89502 and previous config saved to /var/cache/conftool/dbconfig/20260302-131630-marostegui.json
* 13:15 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host dse-k8s-worker1024.eqiad.wmnet
* 13:14 moritzm: installing libcap2 updates from Bookworm point release
* 13:12 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1156.eqiad.wmnet
* 13:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1155.eqiad.wmnet
* 13:08 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host dse-k8s-worker1024.eqiad.wmnet
* 13:07 ladsgroup@cumin1003: END (FAIL) - Cookbook sre.wikireplicas.update-views (exit_code=99)
* 13:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203', diff saved to https://phabricator.wikimedia.org/P89500 and previous config saved to /var/cache/conftool/dbconfig/20260302-130424-marostegui.json
* 13:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P89499 and previous config saved to /var/cache/conftool/dbconfig/20260302-130122-marostegui.json
* 13:00 ladsgroup@cumin1003: START - Cookbook sre.wikireplicas.update-views
* 12:58 jayme@cumin1003: END (PASS) - Cookbook sre.k8s.pool-depool-node (exit_code=0) pool for host wikikube-worker2356.codfw.wmnet
* 12:58 jayme@cumin1003: START - Cookbook sre.k8s.pool-depool-node pool for host wikikube-worker2356.codfw.wmnet
* 12:58 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1155.eqiad.wmnet
* 12:58 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1154.eqiad.wmnet
* 12:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89498 and previous config saved to /var/cache/conftool/dbconfig/20260302-124917-marostegui.json
* 12:46 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1154.eqiad.wmnet
* 12:46 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1153.eqiad.wmnet
* 12:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167', diff saved to https://phabricator.wikimedia.org/P89497 and previous config saved to /var/cache/conftool/dbconfig/20260302-124615-marostegui.json
* 12:35 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1153.eqiad.wmnet
* 12:35 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1152.eqiad.wmnet
* 12:32 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1203 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89494 and previous config saved to /var/cache/conftool/dbconfig/20260302-123253-marostegui.json
* 12:32 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1203.eqiad.wmnet with reason: Maintenance
* 12:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89493 and previous config saved to /var/cache/conftool/dbconfig/20260302-123229-marostegui.json
* 12:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89492 and previous config saved to /var/cache/conftool/dbconfig/20260302-123108-marostegui.json
* 12:27 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 12:26 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 12:24 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1152.eqiad.wmnet
* 12:24 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1151.eqiad.wmnet
* 12:23 kevinbazira@deploy2002: helmfile [ml-staging-codfw] Ran 'sync' command on namespace 'experimental' for release 'main' .
* 12:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/turnilo: apply
* 12:20 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/turnilo: apply
* 12:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P89491 and previous config saved to /var/cache/conftool/dbconfig/20260302-121722-marostegui.json
* 12:15 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89490 and previous config saved to /var/cache/conftool/dbconfig/20260302-121525-marostegui.json
* 12:15 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2167.codfw.wmnet with reason: Maintenance
* 12:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89489 and previous config saved to /var/cache/conftool/dbconfig/20260302-121501-marostegui.json
* 12:12 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1151.eqiad.wmnet
* 12:12 btullis@cumin1003: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host an-worker1150.eqiad.wmnet
* 12:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193', diff saved to https://phabricator.wikimedia.org/P89488 and previous config saved to /var/cache/conftool/dbconfig/20260302-120214-marostegui.json
* 12:00 btullis@cumin1003: START - Cookbook sre.hosts.reboot-single for host an-worker1150.eqiad.wmnet
* 11:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P89487 and previous config saved to /var/cache/conftool/dbconfig/20260302-115953-marostegui.json
* 11:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1193 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89486 and previous config saved to /var/cache/conftool/dbconfig/20260302-114706-marostegui.json
* 11:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166', diff saved to https://phabricator.wikimedia.org/P89485 and previous config saved to /var/cache/conftool/dbconfig/20260302-114446-marostegui.json
* 11:30 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1193 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89484 and previous config saved to /var/cache/conftool/dbconfig/20260302-113034-marostegui.json
* 11:30 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host ganeti-jumbo1001.eqiad.wmnet
* 11:30 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1193.eqiad.wmnet with reason: Maintenance
* 11:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89483 and previous config saved to /var/cache/conftool/dbconfig/20260302-113010-marostegui.json
* 11:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2166 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89482 and previous config saved to /var/cache/conftool/dbconfig/20260302-112937-marostegui.json
* 11:25 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host ganeti-jumbo1001.eqiad.wmnet
* 11:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P89481 and previous config saved to /var/cache/conftool/dbconfig/20260302-111502-marostegui.json
* 11:13 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2166 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89480 and previous config saved to /var/cache/conftool/dbconfig/20260302-111351-marostegui.json
* 11:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2166.codfw.wmnet with reason: Maintenance
* 11:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89479 and previous config saved to /var/cache/conftool/dbconfig/20260302-111327-marostegui.json
* 11:03 jmm@cumin2002: END (PASS) - Cookbook sre.hosts.reboot-single (exit_code=0) for host cumin2003.codfw.wmnet
* 10:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192', diff saved to https://phabricator.wikimedia.org/P89478 and previous config saved to /var/cache/conftool/dbconfig/20260302-105955-marostegui.json
* 10:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P89477 and previous config saved to /var/cache/conftool/dbconfig/20260302-105818-marostegui.json
* 10:57 jmm@cumin2002: START - Cookbook sre.hosts.reboot-single for host cumin2003.codfw.wmnet
* 10:55 brouberol@deploy2002: helmfile [dse-k8s-codfw] DONE helmfile.d/admin 'apply'.
* 10:54 brouberol@deploy2002: helmfile [dse-k8s-codfw] START helmfile.d/admin 'apply'.
* 10:53 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/admin 'apply'.
* 10:52 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/admin 'apply'.
* 10:51 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook-next: apply
* 10:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook-next: apply
* 10:50 brouberol@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/growthbook: apply
* 10:49 brouberol@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/growthbook: apply
* 10:46 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-text_magru and A:cp - 3.0 upgrade ()
* 10:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89476 and previous config saved to /var/cache/conftool/dbconfig/20260302-104446-marostegui.json
* 10:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2164', diff saved to https://phabricator.wikimedia.org/P89475 and previous config saved to /var/cache/conftool/dbconfig/20260302-104310-marostegui.json
* 10:28 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89474 and previous config saved to /var/cache/conftool/dbconfig/20260302-102825-marostegui.json
* 10:28 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1192.eqiad.wmnet with reason: Maintenance
* 10:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89473 and previous config saved to /var/cache/conftool/dbconfig/20260302-102800-marostegui.json
* 10:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P89472 and previous config saved to /var/cache/conftool/dbconfig/20260302-101252-marostegui.json
* 10:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2164 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89471 and previous config saved to /var/cache/conftool/dbconfig/20260302-101200-marostegui.json
* 10:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2164.codfw.wmnet with reason: Maintenance
* 10:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89470 and previous config saved to /var/cache/conftool/dbconfig/20260302-101135-marostegui.json
* 10:08 moritzm: installing intel-microcode bugfix updates on Bookworm hosts
* 09:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178', diff saved to https://phabricator.wikimedia.org/P89469 and previous config saved to /var/cache/conftool/dbconfig/20260302-095744-marostegui.json
* 09:57 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-text_magru and A:cp - 3.0 upgrade ()
* 09:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P89468 and previous config saved to /var/cache/conftool/dbconfig/20260302-095627-marostegui.json
* 09:55 fabfur: start upgrading haproxy to 3.0 on A:cp-text_magru ([[phab:T417253|T417253]])
* 09:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89467 and previous config saved to /var/cache/conftool/dbconfig/20260302-094236-marostegui.json
* 09:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163', diff saved to https://phabricator.wikimedia.org/P89466 and previous config saved to /var/cache/conftool/dbconfig/20260302-094118-marostegui.json
* 09:35 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:35 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:34 moritzm: installing GnuTLS security updates
* 09:34 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] DONE helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:33 javiermonton@deploy2002: helmfile [dse-k8s-eqiad] START helmfile.d/dse-k8s-services/mw-page-html-content-change-enrich-next: apply
* 09:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2163 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89465 and previous config saved to /var/cache/conftool/dbconfig/20260302-092610-marostegui.json
* 09:26 mlitn@deploy2002: Finished scap sync-world: Backport for [[gerrit:1245265{{!}}Limit additional whitespace to sticky header version only (T416598)]] (duration: 11m 02s)
* 09:26 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89464 and previous config saved to /var/cache/conftool/dbconfig/20260302-092600-marostegui.json
* 09:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1178.eqiad.wmnet with reason: Maintenance
* 09:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89463 and previous config saved to /var/cache/conftool/dbconfig/20260302-092535-marostegui.json
* 09:21 mlitn@deploy2002: mlitn: Continuing with sync
* 09:16 mlitn@deploy2002: mlitn: Backport for [[gerrit:1245265{{!}}Limit additional whitespace to sticky header version only (T416598)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 09:15 mlitn@deploy2002: Started scap sync-world: Backport for [[gerrit:1245265{{!}}Limit additional whitespace to sticky header version only (T416598)]]
* 09:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P89462 and previous config saved to /var/cache/conftool/dbconfig/20260302-091027-marostegui.json
* 09:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2163 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89461 and previous config saved to /var/cache/conftool/dbconfig/20260302-091003-marostegui.json
* 09:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2163.codfw.wmnet with reason: Maintenance
* 09:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89460 and previous config saved to /var/cache/conftool/dbconfig/20260302-090938-marostegui.json
* 09:08 kharlan@deploy2002: Finished scap sync-world: Backport for [[gerrit:1246904{{!}}HCaptchaEnterpriseHealthChecker: Add configurable retry count and delay (T418477)]] (duration: 16m 09s)
* 09:02 kharlan@deploy2002: kharlan: Continuing with sync
* 08:57 kharlan@deploy2002: kharlan: Backport for [[gerrit:1246904{{!}}HCaptchaEnterpriseHealthChecker: Add configurable retry count and delay (T418477)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177', diff saved to https://phabricator.wikimedia.org/P89459 and previous config saved to /var/cache/conftool/dbconfig/20260302-085519-marostegui.json
* 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P89458 and previous config saved to /var/cache/conftool/dbconfig/20260302-085430-marostegui.json
* 08:51 kharlan@deploy2002: Started scap sync-world: Backport for [[gerrit:1246904{{!}}HCaptchaEnterpriseHealthChecker: Add configurable retry count and delay (T418477)]]
* 08:48 fabfur@cumin1003: END (PASS) - Cookbook sre.cdn.roll-upgrade-haproxy (exit_code=0) rolling upgrade of HAProxy on A:cp-upload_magru and A:cp - 3.0 upgrade ()
* 08:47 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbproxy1029.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 08:45 moritzm: installing libxml2 security updates
* 08:44 kgraessle@deploy2002: Finished scap sync-world: Backport for [[gerrit:1240672{{!}}Enable revert risk filters for first batch of wikis: < 1000 monthly edits (T411485)]] (duration: 37m 12s)
* 08:42 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dbproxy1029.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 08:41 elukey@cumin1003: END (PASS) - Cookbook sre.hosts.provision (exit_code=0) for host dbproxy1028.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 08:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1177 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89457 and previous config saved to /var/cache/conftool/dbconfig/20260302-084010-marostegui.json
* 08:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161', diff saved to https://phabricator.wikimedia.org/P89456 and previous config saved to /var/cache/conftool/dbconfig/20260302-083922-marostegui.json
* 08:31 kgraessle@deploy2002: kgraessle: Continuing with sync
* 08:30 kgraessle@deploy2002: kgraessle: Backport for [[gerrit:1240672{{!}}Enable revert risk filters for first batch of wikis: < 1000 monthly edits (T411485)]] synced to the testservers (see https://wikitech.wikimedia.org/wiki/Mwdebug). Changes can now be verified there.
* 08:30 elukey@cumin1003: START - Cookbook sre.hosts.provision for host dbproxy1028.mgmt.eqiad.wmnet with chassis set policy GRACEFUL_RESTART
* 08:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89455 and previous config saved to /var/cache/conftool/dbconfig/20260302-082414-marostegui.json
* 08:23 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1177 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89454 and previous config saved to /var/cache/conftool/dbconfig/20260302-082333-marostegui.json
* 08:23 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1177.eqiad.wmnet with reason: Maintenance
* 08:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89453 and previous config saved to /var/cache/conftool/dbconfig/20260302-082309-marostegui.json
* 08:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbproxy1028.eqiad.wmnet with reason: Maintenance
* 08:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbproxy1029.eqiad.wmnet with reason: Maintenance
* 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89452 and previous config saved to /var/cache/conftool/dbconfig/20260302-080813-marostegui.json
* 08:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2161.codfw.wmnet with reason: Maintenance
* 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P89451 and previous config saved to /var/cache/conftool/dbconfig/20260302-080800-marostegui.json
* 08:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89450 and previous config saved to /var/cache/conftool/dbconfig/20260302-080748-marostegui.json
* 08:07 kgraessle@deploy2002: Started scap sync-world: Backport for [[gerrit:1240672{{!}}Enable revert risk filters for first batch of wikis: < 1000 monthly edits (T411485)]]
* 08:05 fabfur@cumin1003: START - Cookbook sre.cdn.roll-upgrade-haproxy rolling upgrade of HAProxy on A:cp-upload_magru and A:cp - 3.0 upgrade ()
* 08:05 fabfur: start upgrading haproxy to 3.0 on A:cp-upload_magru ([[phab:T417253|T417253]])
* 07:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172', diff saved to https://phabricator.wikimedia.org/P89449 and previous config saved to /var/cache/conftool/dbconfig/20260302-075252-marostegui.json
* 07:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P89448 and previous config saved to /var/cache/conftool/dbconfig/20260302-075241-marostegui.json
* 07:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89447 and previous config saved to /var/cache/conftool/dbconfig/20260302-073745-marostegui.json
* 07:37 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154', diff saved to https://phabricator.wikimedia.org/P89446 and previous config saved to /var/cache/conftool/dbconfig/20260302-073732-marostegui.json
* 07:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2154 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89445 and previous config saved to /var/cache/conftool/dbconfig/20260302-072224-marostegui.json
* 07:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1172 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89444 and previous config saved to /var/cache/conftool/dbconfig/20260302-072058-marostegui.json
* 07:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1172.eqiad.wmnet with reason: Maintenance
* 07:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 07:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89443 and previous config saved to /var/cache/conftool/dbconfig/20260302-070523-marostegui.json
* 07:05 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2154 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89442 and previous config saved to /var/cache/conftool/dbconfig/20260302-070512-marostegui.json
* 07:05 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2154.codfw.wmnet with reason: Maintenance
* 07:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89441 and previous config saved to /var/cache/conftool/dbconfig/20260302-070447-marostegui.json
* 07:01 marostegui@cumin1003: END (PASS) - Cookbook sre.mysql.pool (exit_code=0) pool db1244: After schema change
* 06:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P89439 and previous config saved to /var/cache/conftool/dbconfig/20260302-065014-marostegui.json
* 06:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P89438 and previous config saved to /var/cache/conftool/dbconfig/20260302-064938-marostegui.json
* 06:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167', diff saved to https://phabricator.wikimedia.org/P89436 and previous config saved to /var/cache/conftool/dbconfig/20260302-063506-marostegui.json
* 06:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152', diff saved to https://phabricator.wikimedia.org/P89435 and previous config saved to /var/cache/conftool/dbconfig/20260302-063430-marostegui.json
* 06:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89433 and previous config saved to /var/cache/conftool/dbconfig/20260302-061957-marostegui.json
* 06:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2152 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89432 and previous config saved to /var/cache/conftool/dbconfig/20260302-061922-marostegui.json
* 06:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1 day, 0:00:00 on db2240.codfw.wmnet with reason: Maintenance
* 06:16 marostegui@cumin1003: START - Cookbook sre.mysql.pool pool db1244: After schema change
* 06:15 marostegui@dns1004: END - running authdns-update
* 06:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depool db2240 [[phab:T418080|T418080]]', diff saved to https://phabricator.wikimedia.org/P89430 and previous config saved to /var/cache/conftool/dbconfig/20260302-061428-marostegui.json
* 06:13 marostegui@dns1004: START - running authdns-update
* 06:13 marostegui@cumin1003: dbctl commit (dc=all): 'Promote db2179 to s4 primary and set section read-write [[phab:T418080|T418080]]', diff saved to https://phabricator.wikimedia.org/P89429 and previous config saved to /var/cache/conftool/dbconfig/20260302-061316-marostegui.json
* 06:12 marostegui@cumin1003: dbctl commit (dc=all): 'Set s4 codfw as read-only for maintenance - [[phab:T418080|T418080]]', diff saved to https://phabricator.wikimedia.org/P89428 and previous config saved to /var/cache/conftool/dbconfig/20260302-061252-marostegui.json
* 06:06 marostegui: Starting s4 codfw failover from db2240 to db2179 - [[phab:T418080|T418080]]
* 06:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 1:00:00 on 42 hosts with reason: Primary switchover s4 [[phab:T418080|T418080]]
* 06:03 marostegui@cumin1003: dbctl commit (dc=all): 'Set db2179 with weight 0 [[phab:T418080|T418080]]', diff saved to https://phabricator.wikimedia.org/P89427 and previous config saved to /var/cache/conftool/dbconfig/20260302-060317-marostegui.json
* 06:03 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1167 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89426 and previous config saved to /var/cache/conftool/dbconfig/20260302-060317-marostegui.json
* 06:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 06:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1167.eqiad.wmnet with reason: Maintenance
* 06:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2152 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89425 and previous config saved to /var/cache/conftool/dbconfig/20260302-060245-marostegui.json
* 06:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2152.codfw.wmnet with reason: Maintenance
* 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1181.eqiad.wmnet with reason: Maintenance
* 05:58 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2220.codfw.wmnet with reason: Maintenance
* 02:14 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 13s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
* 00:50 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 00:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89424 and previous config saved to /var/cache/conftool/dbconfig/20260302-004950-marostegui.json
* 00:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P89423 and previous config saved to /var/cache/conftool/dbconfig/20260302-003441-marostegui.json
* 00:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253', diff saved to https://phabricator.wikimedia.org/P89422 and previous config saved to /var/cache/conftool/dbconfig/20260302-001933-marostegui.json
* 00:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1253 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89421 and previous config saved to /var/cache/conftool/dbconfig/20260302-000425-marostegui.json
* 00:02 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1253 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89420 and previous config saved to /var/cache/conftool/dbconfig/20260302-000208-marostegui.json
* 00:02 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1253.eqiad.wmnet with reason: Maintenance
* 00:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89419 and previous config saved to /var/cache/conftool/dbconfig/20260302-000143-marostegui.json
== 2026-03-01 ==
* 23:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P89418 and previous config saved to /var/cache/conftool/dbconfig/20260301-234635-marostegui.json
* 23:35 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89417 and previous config saved to /var/cache/conftool/dbconfig/20260301-233524-marostegui.json
* 23:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236', diff saved to https://phabricator.wikimedia.org/P89416 and previous config saved to /var/cache/conftool/dbconfig/20260301-233127-marostegui.json
* 23:20 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P89415 and previous config saved to /var/cache/conftool/dbconfig/20260301-232016-marostegui.json
* 23:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89414 and previous config saved to /var/cache/conftool/dbconfig/20260301-231619-marostegui.json
* 23:14 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1236 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89413 and previous config saved to /var/cache/conftool/dbconfig/20260301-231404-marostegui.json
* 23:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1236.eqiad.wmnet with reason: Maintenance
* 23:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89412 and previous config saved to /var/cache/conftool/dbconfig/20260301-231339-marostegui.json
* 23:05 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222', diff saved to https://phabricator.wikimedia.org/P89411 and previous config saved to /var/cache/conftool/dbconfig/20260301-230508-marostegui.json
* 22:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P89410 and previous config saved to /var/cache/conftool/dbconfig/20260301-225832-marostegui.json
* 22:50 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2222 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89409 and previous config saved to /var/cache/conftool/dbconfig/20260301-224959-marostegui.json
* 22:44 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2222 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89408 and previous config saved to /var/cache/conftool/dbconfig/20260301-224451-marostegui.json
* 22:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2222.codfw.wmnet with reason: Maintenance
* 22:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89407 and previous config saved to /var/cache/conftool/dbconfig/20260301-224426-marostegui.json
* 22:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231', diff saved to https://phabricator.wikimedia.org/P89406 and previous config saved to /var/cache/conftool/dbconfig/20260301-224324-marostegui.json
* 22:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P89405 and previous config saved to /var/cache/conftool/dbconfig/20260301-222919-marostegui.json
* 22:28 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1231 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89404 and previous config saved to /var/cache/conftool/dbconfig/20260301-222815-marostegui.json
* 22:26 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1231 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89403 and previous config saved to /var/cache/conftool/dbconfig/20260301-222600-marostegui.json
* 22:25 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1231.eqiad.wmnet with reason: Maintenance
* 22:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89402 and previous config saved to /var/cache/conftool/dbconfig/20260301-222536-marostegui.json
* 22:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221', diff saved to https://phabricator.wikimedia.org/P89401 and previous config saved to /var/cache/conftool/dbconfig/20260301-221410-marostegui.json
* 22:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P89400 and previous config saved to /var/cache/conftool/dbconfig/20260301-221027-marostegui.json
* 21:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89399 and previous config saved to /var/cache/conftool/dbconfig/20260301-215902-marostegui.json
* 21:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227', diff saved to https://phabricator.wikimedia.org/P89398 and previous config saved to /var/cache/conftool/dbconfig/20260301-215519-marostegui.json
* 21:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2221 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89397 and previous config saved to /var/cache/conftool/dbconfig/20260301-215404-marostegui.json
* 21:53 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2221.codfw.wmnet with reason: Maintenance
* 21:53 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89396 and previous config saved to /var/cache/conftool/dbconfig/20260301-215339-marostegui.json
* 21:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1227 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89395 and previous config saved to /var/cache/conftool/dbconfig/20260301-214011-marostegui.json
* 21:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P89394 and previous config saved to /var/cache/conftool/dbconfig/20260301-213831-marostegui.json
* 21:34 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1227 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89393 and previous config saved to /var/cache/conftool/dbconfig/20260301-213410-marostegui.json
* 21:34 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1227.eqiad.wmnet with reason: Maintenance
* 21:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89392 and previous config saved to /var/cache/conftool/dbconfig/20260301-213346-marostegui.json
* 21:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218', diff saved to https://phabricator.wikimedia.org/P89391 and previous config saved to /var/cache/conftool/dbconfig/20260301-212323-marostegui.json
* 21:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P89390 and previous config saved to /var/cache/conftool/dbconfig/20260301-211837-marostegui.json
* 21:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89389 and previous config saved to /var/cache/conftool/dbconfig/20260301-210815-marostegui.json
* 21:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202', diff saved to https://phabricator.wikimedia.org/P89388 and previous config saved to /var/cache/conftool/dbconfig/20260301-210329-marostegui.json
* 21:03 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2218 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89387 and previous config saved to /var/cache/conftool/dbconfig/20260301-210309-marostegui.json
* 21:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2218.codfw.wmnet with reason: Maintenance
* 21:02 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89386 and previous config saved to /var/cache/conftool/dbconfig/20260301-210244-marostegui.json
* 20:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1202 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89385 and previous config saved to /var/cache/conftool/dbconfig/20260301-204820-marostegui.json
* 20:47 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P89384 and previous config saved to /var/cache/conftool/dbconfig/20260301-204736-marostegui.json
* 20:46 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1202 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89383 and previous config saved to /var/cache/conftool/dbconfig/20260301-204606-marostegui.json
* 20:45 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1202.eqiad.wmnet with reason: Maintenance
* 20:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89382 and previous config saved to /var/cache/conftool/dbconfig/20260301-204541-marostegui.json
* 20:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208', diff saved to https://phabricator.wikimedia.org/P89381 and previous config saved to /var/cache/conftool/dbconfig/20260301-203227-marostegui.json
* 20:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P89380 and previous config saved to /var/cache/conftool/dbconfig/20260301-203033-marostegui.json
* 20:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2208 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89379 and previous config saved to /var/cache/conftool/dbconfig/20260301-201720-marostegui.json
* 20:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194', diff saved to https://phabricator.wikimedia.org/P89378 and previous config saved to /var/cache/conftool/dbconfig/20260301-201525-marostegui.json
* 20:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2208 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89377 and previous config saved to /var/cache/conftool/dbconfig/20260301-201212-marostegui.json
* 20:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2208.codfw.wmnet with reason: Maintenance
* 20:08 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2200.codfw.wmnet with reason: Maintenance
* 20:04 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2198.codfw.wmnet with reason: Maintenance
* 20:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89376 and previous config saved to /var/cache/conftool/dbconfig/20260301-200422-marostegui.json
* 20:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1194 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89375 and previous config saved to /var/cache/conftool/dbconfig/20260301-200016-marostegui.json
* 19:58 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1194 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89374 and previous config saved to /var/cache/conftool/dbconfig/20260301-195803-marostegui.json
* 19:57 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1194.eqiad.wmnet with reason: Maintenance
* 19:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89373 and previous config saved to /var/cache/conftool/dbconfig/20260301-195738-marostegui.json
* 19:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P89372 and previous config saved to /var/cache/conftool/dbconfig/20260301-194914-marostegui.json
* 19:42 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P89371 and previous config saved to /var/cache/conftool/dbconfig/20260301-194230-marostegui.json
* 19:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182', diff saved to https://phabricator.wikimedia.org/P89370 and previous config saved to /var/cache/conftool/dbconfig/20260301-193406-marostegui.json
* 19:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191', diff saved to https://phabricator.wikimedia.org/P89369 and previous config saved to /var/cache/conftool/dbconfig/20260301-192721-marostegui.json
* 19:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2182 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89368 and previous config saved to /var/cache/conftool/dbconfig/20260301-191858-marostegui.json
* 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2182 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89367 and previous config saved to /var/cache/conftool/dbconfig/20260301-191340-marostegui.json
* 19:13 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2182.codfw.wmnet with reason: Maintenance
* 19:13 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89366 and previous config saved to /var/cache/conftool/dbconfig/20260301-191315-marostegui.json
* 19:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1191 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89365 and previous config saved to /var/cache/conftool/dbconfig/20260301-191213-marostegui.json
* 19:10 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1191 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89364 and previous config saved to /var/cache/conftool/dbconfig/20260301-190958-marostegui.json
* 19:09 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1191.eqiad.wmnet with reason: Maintenance
* 19:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89363 and previous config saved to /var/cache/conftool/dbconfig/20260301-190934-marostegui.json
* 18:58 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P89362 and previous config saved to /var/cache/conftool/dbconfig/20260301-185807-marostegui.json
* 18:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P89361 and previous config saved to /var/cache/conftool/dbconfig/20260301-185425-marostegui.json
* 18:43 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168', diff saved to https://phabricator.wikimedia.org/P89360 and previous config saved to /var/cache/conftool/dbconfig/20260301-184259-marostegui.json
* 18:39 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174', diff saved to https://phabricator.wikimedia.org/P89359 and previous config saved to /var/cache/conftool/dbconfig/20260301-183917-marostegui.json
* 18:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2168 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89358 and previous config saved to /var/cache/conftool/dbconfig/20260301-182750-marostegui.json
* 18:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89357 and previous config saved to /var/cache/conftool/dbconfig/20260301-182409-marostegui.json
* 18:22 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2168 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89356 and previous config saved to /var/cache/conftool/dbconfig/20260301-182238-marostegui.json
* 18:22 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2168.codfw.wmnet with reason: Maintenance
* 18:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89355 and previous config saved to /var/cache/conftool/dbconfig/20260301-182213-marostegui.json
* 18:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1174 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89354 and previous config saved to /var/cache/conftool/dbconfig/20260301-182153-marostegui.json
* 18:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1174.eqiad.wmnet with reason: Maintenance
* 18:18 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1171.eqiad.wmnet with reason: Maintenance
* 18:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89353 and previous config saved to /var/cache/conftool/dbconfig/20260301-181818-marostegui.json
* 18:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P89352 and previous config saved to /var/cache/conftool/dbconfig/20260301-180705-marostegui.json
* 18:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P89351 and previous config saved to /var/cache/conftool/dbconfig/20260301-180310-marostegui.json
* 17:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159', diff saved to https://phabricator.wikimedia.org/P89350 and previous config saved to /var/cache/conftool/dbconfig/20260301-175157-marostegui.json
* 17:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170', diff saved to https://phabricator.wikimedia.org/P89349 and previous config saved to /var/cache/conftool/dbconfig/20260301-174802-marostegui.json
* 17:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89348 and previous config saved to /var/cache/conftool/dbconfig/20260301-173649-marostegui.json
* 17:32 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89347 and previous config saved to /var/cache/conftool/dbconfig/20260301-173253-marostegui.json
* 17:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89346 and previous config saved to /var/cache/conftool/dbconfig/20260301-173134-marostegui.json
* 17:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2159.codfw.wmnet with reason: Maintenance
* 17:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89345 and previous config saved to /var/cache/conftool/dbconfig/20260301-173110-marostegui.json
* 17:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1170 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89344 and previous config saved to /var/cache/conftool/dbconfig/20260301-172742-marostegui.json
* 17:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1170.eqiad.wmnet with reason: Maintenance
* 17:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89343 and previous config saved to /var/cache/conftool/dbconfig/20260301-172717-marostegui.json
* 17:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P89342 and previous config saved to /var/cache/conftool/dbconfig/20260301-171602-marostegui.json
* 17:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P89341 and previous config saved to /var/cache/conftool/dbconfig/20260301-171210-marostegui.json
* 17:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150', diff saved to https://phabricator.wikimedia.org/P89340 and previous config saved to /var/cache/conftool/dbconfig/20260301-170053-marostegui.json
* 16:57 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158', diff saved to https://phabricator.wikimedia.org/P89339 and previous config saved to /var/cache/conftool/dbconfig/20260301-165701-marostegui.json
* 16:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2150 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89338 and previous config saved to /var/cache/conftool/dbconfig/20260301-164545-marostegui.json
* 16:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1158 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89337 and previous config saved to /var/cache/conftool/dbconfig/20260301-164153-marostegui.json
* 16:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2150 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89336 and previous config saved to /var/cache/conftool/dbconfig/20260301-164022-marostegui.json
* 16:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2150.codfw.wmnet with reason: Maintenance
* 16:39 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1158 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89335 and previous config saved to /var/cache/conftool/dbconfig/20260301-163938-marostegui.json
* 16:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1014,1018].eqiad.wmnet,db1155.eqiad.wmnet with reason: Maintenance
* 16:39 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1158.eqiad.wmnet with reason: Maintenance
* 16:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1230.eqiad.wmnet with reason: Maintenance
* 16:36 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2213.codfw.wmnet with reason: Maintenance
* 12:22 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89334 and previous config saved to /var/cache/conftool/dbconfig/20260301-122201-marostegui.json
* 12:06 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P89333 and previous config saved to /var/cache/conftool/dbconfig/20260301-120652-marostegui.json
* 11:51 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228', diff saved to https://phabricator.wikimedia.org/P89332 and previous config saved to /var/cache/conftool/dbconfig/20260301-115144-marostegui.json
* 11:36 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2228 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89331 and previous config saved to /var/cache/conftool/dbconfig/20260301-113636-marostegui.json
* 11:31 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2228 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89330 and previous config saved to /var/cache/conftool/dbconfig/20260301-113156-marostegui.json
* 11:31 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2228.codfw.wmnet with reason: Maintenance
* 11:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89329 and previous config saved to /var/cache/conftool/dbconfig/20260301-113131-marostegui.json
* 11:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on dbstore1008.eqiad.wmnet with reason: Maintenance
* 11:19 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1245.eqiad.wmnet with reason: Maintenance
* 11:17 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1216.eqiad.wmnet with reason: Maintenance
* 11:17 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89328 and previous config saved to /var/cache/conftool/dbconfig/20260301-111658-marostegui.json
* 11:16 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P89327 and previous config saved to /var/cache/conftool/dbconfig/20260301-111622-marostegui.json
* 11:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P89326 and previous config saved to /var/cache/conftool/dbconfig/20260301-110151-marostegui.json
* 11:01 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223', diff saved to https://phabricator.wikimedia.org/P89325 and previous config saved to /var/cache/conftool/dbconfig/20260301-110114-marostegui.json
* 10:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210', diff saved to https://phabricator.wikimedia.org/P89324 and previous config saved to /var/cache/conftool/dbconfig/20260301-104642-marostegui.json
* 10:46 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2223 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89323 and previous config saved to /var/cache/conftool/dbconfig/20260301-104606-marostegui.json
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2223 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89322 and previous config saved to /var/cache/conftool/dbconfig/20260301-104024-marostegui.json
* 10:40 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2223.codfw.wmnet with reason: Maintenance
* 10:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89321 and previous config saved to /var/cache/conftool/dbconfig/20260301-103958-marostegui.json
* 10:31 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89320 and previous config saved to /var/cache/conftool/dbconfig/20260301-103134-marostegui.json
* 10:27 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1210 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89319 and previous config saved to /var/cache/conftool/dbconfig/20260301-102727-marostegui.json
* 10:27 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1210.eqiad.wmnet with reason: Maintenance
* 10:27 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89318 and previous config saved to /var/cache/conftool/dbconfig/20260301-102702-marostegui.json
* 10:24 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P89317 and previous config saved to /var/cache/conftool/dbconfig/20260301-102450-marostegui.json
* 10:11 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P89316 and previous config saved to /var/cache/conftool/dbconfig/20260301-101154-marostegui.json
* 10:09 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211', diff saved to https://phabricator.wikimedia.org/P89315 and previous config saved to /var/cache/conftool/dbconfig/20260301-100942-marostegui.json
* 09:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207', diff saved to https://phabricator.wikimedia.org/P89314 and previous config saved to /var/cache/conftool/dbconfig/20260301-095645-marostegui.json
* 09:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2211 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89313 and previous config saved to /var/cache/conftool/dbconfig/20260301-095434-marostegui.json
* 09:48 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2211 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89312 and previous config saved to /var/cache/conftool/dbconfig/20260301-094847-marostegui.json
* 09:48 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2211.codfw.wmnet with reason: Maintenance
* 09:44 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2201.codfw.wmnet with reason: Maintenance
* 09:44 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89311 and previous config saved to /var/cache/conftool/dbconfig/20260301-094432-marostegui.json
* 09:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1207 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89310 and previous config saved to /var/cache/conftool/dbconfig/20260301-094137-marostegui.json
* 09:38 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1207 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89309 and previous config saved to /var/cache/conftool/dbconfig/20260301-093835-marostegui.json
* 09:38 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1207.eqiad.wmnet with reason: Maintenance
* 09:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89308 and previous config saved to /var/cache/conftool/dbconfig/20260301-093810-marostegui.json
* 09:29 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P89307 and previous config saved to /var/cache/conftool/dbconfig/20260301-092923-marostegui.json
* 09:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P89306 and previous config saved to /var/cache/conftool/dbconfig/20260301-092302-marostegui.json
* 09:14 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192', diff saved to https://phabricator.wikimedia.org/P89305 and previous config saved to /var/cache/conftool/dbconfig/20260301-091415-marostegui.json
* 09:07 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200', diff saved to https://phabricator.wikimedia.org/P89304 and previous config saved to /var/cache/conftool/dbconfig/20260301-090754-marostegui.json
* 08:59 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89303 and previous config saved to /var/cache/conftool/dbconfig/20260301-085907-marostegui.json
* 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2192 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89302 and previous config saved to /var/cache/conftool/dbconfig/20260301-085427-marostegui.json
* 08:54 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2192.codfw.wmnet with reason: Maintenance
* 08:54 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89301 and previous config saved to /var/cache/conftool/dbconfig/20260301-085403-marostegui.json
* 08:52 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1200 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89300 and previous config saved to /var/cache/conftool/dbconfig/20260301-085246-marostegui.json
* 08:49 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1200 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89299 and previous config saved to /var/cache/conftool/dbconfig/20260301-084952-marostegui.json
* 08:49 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1200.eqiad.wmnet with reason: Maintenance
* 08:49 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89298 and previous config saved to /var/cache/conftool/dbconfig/20260301-084928-marostegui.json
* 08:38 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P89297 and previous config saved to /var/cache/conftool/dbconfig/20260301-083855-marostegui.json
* 08:34 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P89296 and previous config saved to /var/cache/conftool/dbconfig/20260301-083420-marostegui.json
* 08:23 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178', diff saved to https://phabricator.wikimedia.org/P89295 and previous config saved to /var/cache/conftool/dbconfig/20260301-082346-marostegui.json
* 08:19 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185', diff saved to https://phabricator.wikimedia.org/P89294 and previous config saved to /var/cache/conftool/dbconfig/20260301-081912-marostegui.json
* 08:08 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2178 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89293 and previous config saved to /var/cache/conftool/dbconfig/20260301-080838-marostegui.json
* 08:04 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1185 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89292 and previous config saved to /var/cache/conftool/dbconfig/20260301-080404-marostegui.json
* 08:03 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2178.codfw.wmnet with reason: Maintenance
* 08:03 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89291 and previous config saved to /var/cache/conftool/dbconfig/20260301-080341-marostegui.json
* 08:01 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1185 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89290 and previous config saved to /var/cache/conftool/dbconfig/20260301-080110-marostegui.json
* 08:01 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1185.eqiad.wmnet with reason: Maintenance
* 08:00 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89289 and previous config saved to /var/cache/conftool/dbconfig/20260301-080044-marostegui.json
* 07:48 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P89288 and previous config saved to /var/cache/conftool/dbconfig/20260301-074833-marostegui.json
* 07:45 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P89287 and previous config saved to /var/cache/conftool/dbconfig/20260301-074536-marostegui.json
* 07:33 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171', diff saved to https://phabricator.wikimedia.org/P89286 and previous config saved to /var/cache/conftool/dbconfig/20260301-073324-marostegui.json
* 07:30 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161', diff saved to https://phabricator.wikimedia.org/P89285 and previous config saved to /var/cache/conftool/dbconfig/20260301-073028-marostegui.json
* 07:18 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2171 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89284 and previous config saved to /var/cache/conftool/dbconfig/20260301-071816-marostegui.json
* 07:15 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89283 and previous config saved to /var/cache/conftool/dbconfig/20260301-071521-marostegui.json
* 07:12 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2171 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89282 and previous config saved to /var/cache/conftool/dbconfig/20260301-071226-marostegui.json
* 07:12 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2171.codfw.wmnet with reason: Maintenance
* 07:12 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89281 and previous config saved to /var/cache/conftool/dbconfig/20260301-071201-marostegui.json
* 07:11 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1161 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89280 and previous config saved to /var/cache/conftool/dbconfig/20260301-071113-marostegui.json
* 07:11 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 10:00:00 on an-redacteddb1001.eqiad.wmnet,clouddb[1016,1020].eqiad.wmnet,db1154.eqiad.wmnet with reason: Maintenance
* 07:10 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1161.eqiad.wmnet with reason: Maintenance
* 07:10 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89279 and previous config saved to /var/cache/conftool/dbconfig/20260301-071040-marostegui.json
* 06:56 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P89278 and previous config saved to /var/cache/conftool/dbconfig/20260301-065653-marostegui.json
* 06:55 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P89277 and previous config saved to /var/cache/conftool/dbconfig/20260301-065531-marostegui.json
* 06:41 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157', diff saved to https://phabricator.wikimedia.org/P89276 and previous config saved to /var/cache/conftool/dbconfig/20260301-064145-marostegui.json
* 06:40 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159', diff saved to https://phabricator.wikimedia.org/P89275 and previous config saved to /var/cache/conftool/dbconfig/20260301-064023-marostegui.json
* 06:26 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db2157 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89274 and previous config saved to /var/cache/conftool/dbconfig/20260301-062636-marostegui.json
* 06:25 marostegui@cumin1003: dbctl commit (dc=all): 'Repooling after maintenance db1159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89273 and previous config saved to /var/cache/conftool/dbconfig/20260301-062515-marostegui.json
* 06:21 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db1159 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89272 and previous config saved to /var/cache/conftool/dbconfig/20260301-062108-marostegui.json
* 06:21 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db1159.eqiad.wmnet with reason: Maintenance
* 06:20 marostegui@cumin1003: dbctl commit (dc=all): 'Depooling db2157 ([[phab:T418465|T418465]])', diff saved to https://phabricator.wikimedia.org/P89271 and previous config saved to /var/cache/conftool/dbconfig/20260301-062047-marostegui.json
* 06:20 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2157.codfw.wmnet with reason: Maintenance
* 06:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 5:00:00 on db2207.codfw.wmnet with reason: Maintenance
* 06:14 marostegui@cumin1003: DONE (PASS) - Cookbook sre.hosts.downtime (exit_code=0) for 8:00:00 on db1222.eqiad.wmnet with reason: Maintenance
* 02:13 mwpresync@deploy2002: Finished scap build-images: Publishing wmf/next image (duration: 13m 00s)
* 02:00 mwpresync@deploy2002: Started scap build-images: Publishing wmf/next image
== Other archives ==
See [[Server Admin Log/Archives]].
<noinclude>
[[Category:SAL]]
[[Category:Operations]]
</noinclude>
genv5y0ekpx4t5stea9m6ifw2xyu9jk
User:Naresh Krishna Raja
2
16361
2396611
121938
2026-03-29T08:13:24Z
Minorax
38339
2396611
wikitext
text/x-wiki
*Hi friends, I am <span style="color:red;">Naresh Krishna Raja</span>. <span style="color:black;">I am also on [[:w:User:Naresh Krishna Raja|Wikipedia]].</span>
*I hail from Vijayawada. I am a 19-year-old student.
*Contact: to reach me directly, please send me an [[Special:Emailuser/Naresh Krishna Raja|email]] or leave a message on my [[User talk:Naresh Krishna Raja|talk page]].
*Please also see my user pages on other wikis: [[:mw:User:Naresh Krishna Raja|Wikimedia]], [[:m:User:Naresh Krishna Raja|Meta]].
fbma8cxnshj6g1oksylm4ozoe1t4d23
Template:Toolforge nav
10
54864
2396613
2372511
2026-03-29T08:14:29Z
Minorax
38339
2396613
wikitext
text/x-wiki
{{Navigation sidebar
| name = Toolforge nav
| title = [[Help:Toolforge|Toolforge]]
| image = [[File:Toolforge logo.svg|center|50px]]
| content1 =
<inputbox>
type=fulltext
searchfilter=incategory:Toolforge
width=40
placeholder=Search Toolforge documentation
searchbuttonlabel=Search
break=no
</inputbox>
| content2 =
* [[Help:Cloud Services introduction|Cloud Services overview]]
* [[Help:Toolforge|Toolforge user docs]]
* [[Portal:Toolforge/Changelog|Toolforge changelog]]
| heading3 = Get started
| content3 =
* [[Help:Toolforge/Quickstart|Quickstart: set up and get access]]
* [[Portal:Toolforge/About Toolforge|How Toolforge works]]
* [[Help:Toolforge/Terms and conditions|Rules you must follow]]
* [[:Category:Tutorials|Tutorials]]
| heading4 = Build and run tools
| content4 =
* [[Help:Toolforge/Tool accounts|Navigate tool accounts and files]]
* [[Help:Toolforge/Building container images|Build container images for tools]]
* [[Help:Toolforge/Web|Run a web service]]
* [[Help:Toolforge/Running jobs|Schedule and manage jobs]]
* [[Help:Toolforge/Envvars|Manage tool runtime configuration (envvars)]]
* [[Help:Toolforge/Deploy your tool|Deploy your tool on every push (beta)]]
* Language-specific details:
** [[Help:Toolforge/Python|Python]]
** [[Help:Toolforge/Running Pywikibot scripts|Pywikibot]]
** [[Help:Toolforge/Node.js|Node.js]]
** [[Help:Toolforge/PHP|PHP]]
** [[:Category:How-to-guide|...more languages/frameworks]]
* [[Help:Toolforge/Redis|Use Redis for caching]]
* [[Help:Toolforge/Elasticsearch|Index content with Elasticsearch]]
| heading5 = Access shared storage and databases
| content5 =
* [[Help:Shared storage|Access shared storage and public wiki dumps]]
* [[Help:Toolforge/Database|Access the Wiki Replicas databases]]
* [[Help:CirrusSearch OpenSearch replicas|Access replica search indices]]
* [[Help:Toolforge/Database#User databases|Manage tool databases]]
* [[Help:Toolforge/Email|Send and receive email as a tool]]
| heading7 = Share and maintain tools
| content7 =
* [[Help:Toolforge/Version control|Set up version control and code review]]
* [[Help:Toolforge/Developing successful tools|Develop successful tools]]
* [[:toolhub:|Find and share tools on Toolhub]]
* [[Help:Toolforge/Tool accounts#Delete a tool account|Delete a tool]]
| heading8 = Get help
| content8 =
* [[Help:Cloud Services communication|How and where to get help]]
* [[Help:Toolforge/Troubleshooting|Troubleshooting]]
* [[Portal:Toolforge/Contributing|Contribute to Toolforge]]
| heading9 = Useful links
| content9 =
* [[Portal:Toolforge/Admin|Toolforge admin docs]]
* [[:Category:Toolforge tools|List of tools]]
* [https://toolsadmin.wikimedia.org/ Toolforge Admin Console (toolsadmin)]
* [[Help:Toolforge/API|Toolforge API]]
}}
<includeonly>{{#ifeq:{{{nocat|}}}||{{#switch:{{NAMESPACE}}
|Obsolete = [[Category:Toolforge archive|{{SUBPAGENAME}}]]
|Help|Portal| = [[Category:Toolforge|{{#switch:{{ROOTPAGENAME}}|
Toolforge = {{#titleparts:{{FULLPAGENAME}}||2}}
|News = {{FULLPAGENAME}}
|#default = {{SUBPAGENAME}}
}}]]
}}}}</includeonly><noinclude>
{{documentation}}
[[Category:Cloud Services templates]]
</noinclude>
nc8byqxwjo4nsnf6f2up2z2fcopsmqv
Map of database maintenance
0
449160
2396607
2396598
2026-03-29T00:02:42Z
Dexbot
30554
Bot: Updating the report
2396607
wikitext
text/x-wiki
{{/Header}}
== Today (2026-03-29) ==
== Yesterday (2026-03-28) ==
== Last seven days ==
[[Category:MariaDB]]
it8rb5vv4qxz3155dwcsgh77sbrur9d
Template:Navigation sidebar
10
450252
2396612
2101089
2026-03-29T08:14:15Z
Minorax
38339
2396612
wikitext
text/x-wiki
<templatestyles src="Navigation sidebar/styles.css"/><div role="navigation" class="navigation-not-searchable tpl-navsidebar {{#switch:{{{float|}}}|left|right|none=tpl-navsidebar-float{{{float}}}}} {{{class|}}}" style="{{{style|}}}">{{#if: {{{topimage|<noinclude>demo</noinclude>}}}|
<div class="tpl-navsidebar-topimage">{{{topimage}}}</div>
}}{{#if: {{{title|<noinclude>demo</noinclude>}}}|
<p class="tpl-navsidebar-title">{{{title}}}</p>
}}{{#if: {{{image|<noinclude>demo</noinclude>}}}|
<div class="tpl-navsidebar-image">{{{image}}}</div>
}}{{#if: {{{above|<noinclude>demo</noinclude>}}}|
<div class="tpl-navsidebar-content">
{{{above}}}
</div>
}}<div class="tpl-navsidebar-contents"><!--
-->{{#if: {{{heading1|}}}{{{content1|}}}<noinclude>demo</noinclude> |
<div class="tpl-navsidebar-content">
{{#if: {{{heading1|<noinclude>demo</noinclude>}}}|<p class="tpl-navsidebar-heading">{{{heading1}}}</p>}}
{{{content1}}}
</div>
}}{{#if: {{{heading2|}}}{{{content2|}}}<noinclude>demo</noinclude> |
<div class="tpl-navsidebar-content">
{{#if: {{{heading2|<noinclude>demo</noinclude>}}}|<p class="tpl-navsidebar-heading">{{{heading2}}}</p>}}
{{{content2}}}
</div>
}}{{#if: {{{heading3|}}}{{{content3|}}}<noinclude>demo</noinclude> |
<div class="tpl-navsidebar-content">
{{#if: {{{heading3|<noinclude>demo</noinclude>}}}|<p class="tpl-navsidebar-heading">{{{heading3}}}</p>}}
{{{content3}}}
</div>
}}{{#if: {{{heading4|}}}{{{content4|}}}<noinclude>demo</noinclude> |
<div class="tpl-navsidebar-content">
{{#if: {{{heading4|<noinclude>demo</noinclude>}}}|<p class="tpl-navsidebar-heading">{{{heading4}}}</p>}}
{{{content4}}}
</div>
}}{{#if: {{{heading5|}}}{{{content5|}}}<noinclude>demo</noinclude> |
<div class="tpl-navsidebar-content">
{{#if: {{{heading5|<noinclude>demo</noinclude>}}}|<p class="tpl-navsidebar-heading">{{{heading5}}}</p>}}
{{{content5}}}
</div>
}}{{#if: {{{heading6|}}}{{{content6|}}}<noinclude>demo</noinclude> |
<div class="tpl-navsidebar-content">
{{#if: {{{heading6|<noinclude>demo</noinclude>}}}|<p class="tpl-navsidebar-heading">{{{heading6}}}</p>}}
{{{content6}}}
</div>
}}{{#if: {{{heading7|}}}{{{content7|}}}<noinclude>demo</noinclude> |
<div class="tpl-navsidebar-content">
{{#if: {{{heading7|<noinclude>demo</noinclude>}}}|<p class="tpl-navsidebar-heading">{{{heading7}}}</p>}}
{{{content7}}}
</div>
}}{{#if: {{{heading8|}}}{{{content8|}}}<noinclude>demo</noinclude> |
<div class="tpl-navsidebar-content">
{{#if: {{{heading8|<noinclude>demo</noinclude>}}}|<p class="tpl-navsidebar-heading">{{{heading8}}}</p>}}
{{{content8}}}
</div>
}}{{#if: {{{heading9|}}}{{{content9|}}}<noinclude>demo</noinclude> |
<div class="tpl-navsidebar-content">
{{#if: {{{heading9|<noinclude>demo</noinclude>}}}|<p class="tpl-navsidebar-heading">{{{heading9}}}</p>}}
{{{content9}}}
</div>
}}{{#if: {{{heading10|}}}{{{content10|}}}<noinclude>demo</noinclude> |
<div class="tpl-navsidebar-content">
{{#if: {{{heading10|<noinclude>demo</noinclude>}}}|<p class="tpl-navsidebar-heading">{{{heading10}}}</p>}}
{{{content10}}}
</div>
}}{{#if: {{{heading11|}}}{{{content11|}}}<noinclude>demo</noinclude> |
<div class="tpl-navsidebar-content">
{{#if: {{{heading11|<noinclude>demo</noinclude>}}}|<p class="tpl-navsidebar-heading">{{{heading11}}}</p>}}
{{{content11}}}
</div>
}}{{#if: {{{heading12|}}}{{{content12|}}}<noinclude>demo</noinclude> |
<div class="tpl-navsidebar-content">
{{#if: {{{heading12|<noinclude>demo</noinclude>}}}|<p class="tpl-navsidebar-heading">{{{heading12}}}</p>}}
{{{content12}}}
</div>
}}{{#if: {{{heading13|}}}{{{content13|}}}<noinclude>demo</noinclude> |
<div class="tpl-navsidebar-content">
{{#if: {{{heading13|<noinclude>demo</noinclude>}}}|<p class="tpl-navsidebar-heading">{{{heading13}}}</p>}}
{{{content13}}}
</div>
}}{{#if: {{{heading14|}}}{{{content14|}}}<noinclude>demo</noinclude> |
<div class="tpl-navsidebar-content">
{{#if: {{{heading14|<noinclude>demo</noinclude>}}}|<p class="tpl-navsidebar-heading">{{{heading14}}}</p>}}
{{{content14}}}
</div>
}}{{#if: {{{heading15|}}}{{{content15|}}}<noinclude>demo</noinclude> |
<div class="tpl-navsidebar-content">
{{#if: {{{heading15|<noinclude>demo</noinclude>}}}|<p class="tpl-navsidebar-heading">{{{heading15}}}</p>}}
{{{content15}}}
</div>
}}{{#if: {{{heading16|}}}{{{content16|}}}<noinclude>demo</noinclude> |
<div class="tpl-navsidebar-content">
{{#if: {{{heading16|<noinclude>demo</noinclude>}}}|<p class="tpl-navsidebar-heading">{{{heading16}}}</p>}}
{{{content16}}}
</div>
}}{{#if: {{{heading17|}}}{{{content17|}}}<noinclude>demo</noinclude> |
<div class="tpl-navsidebar-content">
{{#if: {{{heading17|<noinclude>demo</noinclude>}}}|<p class="tpl-navsidebar-heading">{{{heading17}}}</p>}}
{{{content17}}}
</div>
}}{{#if: {{{heading18|}}}{{{content18|}}}<noinclude>demo</noinclude> |
<div class="tpl-navsidebar-content">
{{#if: {{{heading18|<noinclude>demo</noinclude>}}}|<p class="tpl-navsidebar-heading">{{{heading18}}}</p>}}
{{{content18}}}
</div>
}}{{#if: {{{heading19|}}}{{{content19|}}}<noinclude>demo</noinclude> |
<div class="tpl-navsidebar-content">
{{#if: {{{heading19|<noinclude>demo</noinclude>}}}|<p class="tpl-navsidebar-heading">{{{heading19}}}</p>}}
{{{content19}}}
</div>
}}<!-- end of contents --></div><!--
-->{{#if: {{{below|<noinclude>demo</noinclude>}}}|<div class="tpl-navsidebar-content">
{{{below}}}
</div>
}}{{#ifeq: {{{navbar|<noinclude>on</noinclude>}}}|off|| {{#if: {{{name|<noinclude>demo</noinclude>}}}|
<p class="tpl-navsidebar-foot">[<span class="noprint plainlinks"><!--
-->[{{fullurl:Template:{{{name|{{PAGENAME}}}}}|action=edit}} <span title="Edit this template">edit</span>]<!--
--></span>]</p>
}}}}</div><noinclude>{{documentation}}<!-- Please add categories and interwikis to the bottom of this template's /doc subpage, not here --></noinclude>
tsr1vnm14q6u5fbgyc4g0s8g41pnsvv
Tool:Gitlab-account-approval/Log
116
453906
2396606
2396229
2026-03-28T14:51:28Z
Gitlabaccountapprovalbot
37332
@janeeva1 was approved.
2396606
wikitext
text/x-wiki
<noinclude>'''Audit log of approvals''' made by [[gitlab:gitlabaccountapprovalbot|@gitlabaccountapprovalbot]]. __NOTOC__</noinclude>
=== 2026-03-28 ===
* 14:51 [[gitlab:janeeva1|@janeeva1]] was approved.
=== 2026-03-26 ===
* 13:36 [[gitlab:saiphani02|@saiphani02]] was approved.
* 11:48 [[gitlab:valerioboz-wmch|@valerioboz-wmch]] was approved.
=== 2026-03-25 ===
* 09:45 "quansi" was rejected (pending since 2025-12-24T09:42:13.451Z).
* 02:18 [[gitlab:viztor|@viztor]] was approved.
=== 2026-03-24 ===
* 23:18 [[gitlab:maryyann|@maryyann]] was approved.
* 23:01 [[gitlab:codenamenoreste|@codenamenoreste]] was approved.
* 13:36 [[gitlab:marc-maillard-wmse|@marc-maillard-wmse]] was approved.
* 07:39 "fred2675" was rejected (pending since 2025-12-23T07:39:11.380Z).
=== 2026-03-23 ===
* 14:51 [[gitlab:komla|@komla]] was approved.
* 05:51 "lunachuck43" was rejected (pending since 2025-12-22T05:50:17.862Z).
* 04:06 "reza110011" was rejected (pending since 2025-12-22T04:05:25.117Z).
=== 2026-03-20 ===
* 21:54 "mertgor" was rejected (pending since 2025-12-19T21:51:51.419Z).
* 20:57 "autanmahmah" was rejected (pending since 2025-12-19T20:54:51.678Z).
* 09:57 [[gitlab:nethahussain|@nethahussain]] was approved.
* 09:27 [[gitlab:piewriter|@piewriter]] was approved.
* 08:15 [[gitlab:dondersmooi|@dondersmooi]] was approved.
=== 2026-03-19 ===
* 21:03 "sayvhior" was rejected (pending since 2025-12-18T21:02:31.699Z).
=== 2026-03-18 ===
* 20:15 [[gitlab:martinmystere|@martinmystere]] was approved.
=== 2026-03-17 ===
* 02:51 "louperivois" was rejected (pending since 2025-12-16T02:50:48.197Z).
=== 2026-03-16 ===
* 12:54 "mokayaj857" was rejected (pending since 2025-12-15T12:53:39.015Z).
* 06:18 "roamer15" was rejected (pending since 2025-12-15T06:16:38.042Z).
=== 2026-03-14 ===
* 11:12 "umaramuhammad" was rejected (pending since 2025-12-13T11:10:44.004Z).
* 09:33 "akuma19" was rejected (pending since 2025-12-13T09:31:39.044Z).
* 07:06 [[gitlab:syunsyunminmin|@syunsyunminmin]] was approved.
=== 2026-03-12 ===
* 20:24 [[gitlab:11wb|@11wb]] was approved.
* 09:54 [[gitlab:bcxfu75k|@bcxfu75k]] was approved.
=== 2026-03-10 ===
* 09:12 [[gitlab:viktoriahillerudwmse|@viktoriahillerudwmse]] was approved.
=== 2026-03-06 ===
* 08:09 "vazhayilnewone" was rejected (pending since 2025-12-05T08:07:02.184Z).
=== 2026-03-04 ===
* 20:54 [[gitlab:elphie|@elphie]] was approved.
* 11:39 "ronaldahmed" was rejected (pending since 2025-12-03T11:37:47.492Z).
* 02:12 "ltslw" was rejected (pending since 2025-12-03T02:11:52.040Z).
=== 2026-03-02 ===
* 19:21 "dlopez350" was rejected (pending since 2025-12-01T19:20:38.918Z).
* 18:15 [[gitlab:lsandergreen|@lsandergreen]] was approved.
=== 2026-03-01 ===
* 10:51 [[gitlab:clintacc|@clintacc]] was approved.
=== 2026-02-28 ===
* 09:24 "cardboardlamp" was rejected (pending since 2025-11-29T09:22:03.947Z).
* 08:18 "wiki-pavan" was rejected (pending since 2025-11-29T08:16:24.184Z).
=== 2026-02-27 ===
* 20:45 "thisisrick25" was rejected (pending since 2025-11-28T20:42:24.454Z).
=== 2026-02-26 ===
* 13:57 "chuiimuiiofc" was rejected (pending since 2025-11-27T13:57:02.794Z).
* 13:54 "steffpro" was rejected (pending since 2025-11-27T13:52:10.859Z).
=== 2026-02-25 ===
* 21:24 "abubakarhabibudayyabu" was rejected (pending since 2025-11-26T21:22:37.776Z).
=== 2026-02-24 ===
* 05:00 "playboi" was rejected (pending since 2025-11-25T05:00:30.762Z).
=== 2026-02-23 ===
* 14:00 "alph65" was rejected (pending since 2025-11-24T13:59:00.797Z).
* 12:33 [[gitlab:robertsky|@robertsky]] was approved.
=== 2026-02-22 ===
* 00:30 "hp8p" was rejected (pending since 2025-11-23T00:29:24.741Z).
=== 2026-02-19 ===
* 16:45 "clayjar" was rejected (pending since 2025-11-20T16:44:48.380Z).
=== 2026-02-18 ===
* 22:18 "nexus" was rejected (pending since 2025-11-19T22:16:48.818Z).
* 12:00 "bernsteinnn" was rejected (pending since 2025-11-19T11:59:04.427Z).
=== 2026-02-17 ===
* 11:36 "jason2000-cpu" was rejected (pending since 2025-11-18T11:34:00.314Z).
=== 2026-02-16 ===
* 14:54 "smaurya" was rejected (pending since 2025-11-17T14:52:06.906Z).
=== 2026-02-15 ===
* 16:51 "kra-79" was rejected (pending since 2025-11-16T16:50:41.375Z).
=== 2026-02-14 ===
* 15:15 [[gitlab:mess|@mess]] was approved.
=== 2026-02-13 ===
* 13:57 "sopalsuemae957" was rejected (pending since 2025-11-14T13:55:16.921Z).
* 13:30 [[gitlab:wyslijp16-toolforge|@wyslijp16-toolforge]] was approved.
=== 2026-02-12 ===
* 16:30 "kristinagligoric" was rejected (pending since 2025-11-13T16:29:21.646Z).
* 03:33 [[gitlab:anyehansen|@anyehansen]] was approved.
* 02:21 [[gitlab:thejoyfultentmaker|@thejoyfultentmaker]] was approved.
=== 2026-02-10 ===
* 13:18 [[gitlab:db111|@db111]] was approved.
=== 2026-02-09 ===
* 19:06 "squirrel289" was rejected (pending since 2025-11-10T19:04:27.831Z).
=== 2026-02-06 ===
* 20:54 [[gitlab:gillux|@gillux]] was approved.
* 09:09 [[gitlab:lih|@lih]] was approved.
=== 2026-01-31 ===
* 16:21 [[gitlab:taxonbot1|@taxonbot1]] was approved.
=== 2026-01-28 ===
* 14:30 [[gitlab:ademola|@ademola]] was approved.
* 10:51 "watshell" was rejected (pending since 2025-10-29T10:51:01.521Z).
=== 2026-01-26 ===
* 23:06 "tavaresgmg" was rejected (pending since 2025-10-27T23:04:42.140Z).
=== 2026-01-25 ===
* 06:03 "cata" was rejected (pending since 2025-10-26T06:01:26.155Z).
=== 2026-01-24 ===
* 21:15 [[gitlab:wiegels|@wiegels]] was approved.
* 06:30 [[gitlab:blaquans|@blaquans]] was approved.
=== 2026-01-23 ===
* 16:27 [[gitlab:lerickson|@lerickson]] was approved.
* 10:15 "fran0035g" was rejected (pending since 2025-10-24T10:12:17.732Z).
=== 2026-01-22 ===
* 21:00 "hacksyn" was rejected (pending since 2025-10-23T20:59:15.982Z).
=== 2026-01-21 ===
* 17:30 [[gitlab:otcenas11|@otcenas11]] was approved.
=== 2026-01-19 ===
* 21:48 [[gitlab:amdrel|@amdrel]] was approved.
* 04:36 "rayalexa" was rejected (pending since 2025-10-20T04:35:02.094Z).
=== 2026-01-18 ===
* 15:45 "somya" was rejected (pending since 2025-10-19T15:43:43.701Z).
* 06:54 "sergg001" was rejected (pending since 2025-10-19T06:54:12.296Z).
=== 2026-01-16 ===
* 11:57 "zeejohsy" was rejected (pending since 2025-10-17T11:56:22.372Z).
* 04:45 "rocky25" was rejected (pending since 2025-10-17T04:43:33.180Z).
=== 2026-01-15 ===
* 16:39 "tiisu" was rejected (pending since 2025-10-16T16:37:18.438Z).
* 12:00 "noahalorwu" was rejected (pending since 2025-10-16T11:58:26.133Z).
* 10:39 "prjayaiuedu" was rejected (pending since 2025-10-16T10:37:16.947Z).
=== 2026-01-13 ===
* 17:21 [[gitlab:lwilson-ctr|@lwilson-ctr]] was approved.
=== 2026-01-12 ===
* 17:03 "stagietechs" was rejected (pending since 2025-10-13T17:02:25.281Z).
=== 2026-01-10 ===
* 19:06 "keerthisr" was rejected (pending since 2025-10-11T19:05:01.758Z).
=== 2026-01-09 ===
* 20:36 "lightb" was rejected (pending since 2025-10-10T20:34:20.264Z).
=== 2026-01-08 ===
* 19:42 [[gitlab:tbodt|@tbodt]] was approved.
* 13:57 [[gitlab:martynranyard|@martynranyard]] was approved.
=== 2026-01-07 ===
* 17:48 [[gitlab:santanuwiki25|@santanuwiki25]] was approved.
* 14:27 "dipanshu" was rejected (pending since 2025-10-08T14:26:10.794Z).
* 12:30 "adeolaadesina" was rejected (pending since 2025-10-08T12:29:49.592Z).
* 09:21 "tony-kamande" was rejected (pending since 2025-10-08T09:20:28.421Z).
* 06:18 "hninwuttyi" was rejected (pending since 2025-10-08T06:17:28.006Z).
* 05:09 "andume" was rejected (pending since 2025-10-08T05:07:18.582Z).
* 02:00 "mosope" was rejected (pending since 2025-10-08T01:59:54.800Z).
* 01:15 [[gitlab:tungstalite|@tungstalite]] was approved.
=== 2026-01-06 ===
* 18:24 "leerensucher" was rejected (pending since 2025-10-07T18:21:41.253Z).
* 14:54 "leonidlednev" was rejected (pending since 2025-10-07T14:53:07.273Z).
* 12:57 "alexandre-tingaud" was rejected (pending since 2025-10-07T12:54:27.206Z).
=== 2026-01-04 ===
* 21:33 [[gitlab:matr1x-101|@matr1x-101]] was approved.
* 15:18 "makjr" was rejected (pending since 2025-10-05T15:16:31.558Z).
* 14:09 "dakshq" was rejected (pending since 2025-10-05T14:08:40.608Z).
=== 2026-01-03 ===
* 20:42 [[gitlab:apehitkey|@apehitkey]] was approved.
* 18:00 [[gitlab:jeremyb|@jeremyb]] was approved.
* 14:09 [[gitlab:twelephant|@twelephant]] was approved.
=== 2026-01-01 ===
* 11:30 "shellstanislav" was rejected (pending since 2025-10-02T11:29:10.150Z).
=== 2025-12-30 ===
* 19:51 "camilojdiaz" was rejected (pending since 2025-09-30T19:49:24.913Z).
=== 2025-12-29 ===
* 16:03 "zied" was rejected (pending since 2025-09-29T16:01:30.415Z).
* 08:18 "rahulsidpradhan" was rejected (pending since 2025-09-29T08:17:02.849Z).
=== 2025-12-26 ===
* 09:48 "thembo42" was rejected (pending since 2025-09-26T09:45:15.033Z).
=== 2025-12-25 ===
* 14:03 "196936074751" was rejected (pending since 2025-09-25T14:02:31.367Z).
=== 2025-12-23 ===
* 16:21 "ngarnsworthy" was rejected (pending since 2025-09-23T16:20:41.211Z).
=== 2025-12-22 ===
* 12:39 "aza555" was rejected (pending since 2025-09-22T12:38:02.622Z).
=== 2025-12-20 ===
* 23:45 "saph" was rejected (pending since 2025-09-20T23:45:01.222Z).
=== 2025-12-19 ===
* 10:15 "vladdymoses" was rejected (pending since 2025-09-19T10:15:00.999Z).
* 07:15 "dirtylittlepoobah" was rejected (pending since 2025-09-19T07:13:55.537Z).
=== 2025-12-18 ===
* 16:24 [[gitlab:guyfawcus|@guyfawcus]] was approved.
=== 2025-12-17 ===
* 21:39 [[gitlab:holdyourhorses|@holdyourhorses]] was approved.
* 18:30 "prudencia" was rejected (pending since 2025-09-17T18:27:18.860Z).
* 02:24 "lottie" was rejected (pending since 2025-09-17T02:21:21.744Z).
=== 2025-12-16 ===
* 09:39 [[gitlab:melcatherine|@melcatherine]] was approved.
* 08:54 [[gitlab:leila237|@leila237]] was approved.
=== 2025-12-15 ===
* 18:27 [[gitlab:royalsailor|@royalsailor]] was approved.
* 09:39 [[gitlab:olaf8940|@olaf8940]] was approved.
* 09:39 "brianbybyby" was rejected (pending since 2025-09-15T09:37:45.430Z).
=== 2025-12-14 ===
* 20:21 [[gitlab:essa237|@essa237]] was approved.
* 16:42 [[gitlab:bovimacoco|@bovimacoco]] was approved.
=== 2025-12-13 ===
* 21:54 "mmns21" was rejected (pending since 2025-09-13T21:52:24.017Z).
* 20:33 "bugcrawler" was rejected (pending since 2025-09-13T20:31:09.211Z).
=== 2025-12-12 ===
* 14:39 "ruvchoudhary" was rejected (pending since 2025-09-12T14:36:16.167Z).
* 06:54 "rezadress" was rejected (pending since 2025-09-12T06:52:21.749Z).
=== 2025-12-10 ===
* 17:30 [[gitlab:itsmoon|@itsmoon]] was approved.
=== 2025-12-09 ===
* 15:42 [[gitlab:mercy-o|@mercy-o]] was approved.
=== 2025-12-06 ===
* 16:45 "jacquesradjabu" was rejected (pending since 2025-09-06T16:45:17.969Z).
* 11:27 [[gitlab:ikhitron|@ikhitron]] was approved.
=== 2025-12-01 ===
* 08:12 "halconmilenario21" was rejected (pending since 2025-09-01T08:12:10.262Z).
=== 2025-11-30 ===
* 21:06 [[gitlab:habs|@habs]] was approved.
=== 2025-11-29 ===
* 16:36 "bovimacoco" was rejected (pending since 2025-08-30T16:34:39.712Z).
* 00:45 [[gitlab:jjpmaster|@jjpmaster]] was approved.
=== 2025-11-24 ===
* 10:30 "alph65" was rejected (pending since 2025-08-25T10:28:40.957Z).
* 02:24 [[gitlab:yaron|@yaron]] was approved.
=== 2025-11-20 ===
* 16:06 "clayjar" was rejected (pending since 2025-08-21T16:04:54.450Z).
=== 2025-11-17 ===
* 21:09 [[gitlab:ankita97531|@ankita97531]] was approved.
=== 2025-11-16 ===
* 14:15 "commanderkefir" was rejected (pending since 2025-08-17T14:13:14.791Z).
* 08:21 "rehankhan78" was rejected (pending since 2025-08-17T08:19:44.896Z).
=== 2025-11-15 ===
* 14:36 "cyberscribe" was rejected (pending since 2025-08-16T14:34:27.230Z).
=== 2025-11-13 ===
* 04:21 "waddie96" was rejected (pending since 2025-08-14T04:19:27.461Z).
=== 2025-11-11 ===
* 06:42 [[gitlab:seanhoyland|@seanhoyland]] was approved.
=== 2025-11-10 ===
* 00:06 [[gitlab:jaredblumer|@jaredblumer]] was approved.
=== 2025-11-09 ===
* 22:36 "heinxiety" was rejected (pending since 2025-08-10T22:33:12.041Z).
=== 2025-11-07 ===
* 22:00 [[gitlab:forzagreen|@forzagreen]] was approved.
=== 2025-11-06 ===
* 16:57 [[gitlab:rsilvola|@rsilvola]] was approved.
=== 2025-11-04 ===
* 21:24 [[gitlab:devdoingdev|@devdoingdev]] was approved.
=== 2025-11-03 ===
* 17:48 "joewaleed98" was rejected (pending since 2025-08-04T17:46:12.191Z).
=== 2025-11-01 ===
* 18:00 "eliasempresas" was rejected (pending since 2025-08-02T17:58:04.412Z).
=== 2025-10-31 ===
* 18:51 [[gitlab:chaoticenby|@chaoticenby]] was approved.
* 04:33 "3ch310n" was rejected (pending since 2025-08-01T04:32:21.982Z).
=== 2025-10-30 ===
* 10:03 [[gitlab:tausheefhassan|@tausheefhassan]] was approved.
=== 2025-10-29 ===
* 14:54 "theap" was rejected (pending since 2025-07-30T14:52:12.066Z).
=== 2025-10-28 ===
* 06:06 [[gitlab:tanbiruzzaman|@tanbiruzzaman]] was approved.
=== 2025-10-27 ===
* 07:51 [[gitlab:jmoore111|@jmoore111]] was approved.
=== 2025-10-25 ===
* 21:09 [[gitlab:valor|@valor]] was approved.
* 21:03 [[gitlab:booksmurf|@booksmurf]] was approved.
* 02:48 "mystyc1" was rejected (pending since 2025-07-26T02:46:19.373Z).
=== 2025-10-24 ===
* 05:12 "aadarshmahesh" was rejected (pending since 2025-07-25T05:09:38.264Z).
=== 2025-10-22 ===
* 20:54 [[gitlab:janewanga|@janewanga]] was approved.
* 17:27 "abeljeevan" was rejected (pending since 2025-07-23T17:26:46.884Z).
* 16:12 "shrimpnaur" was rejected (pending since 2025-07-23T16:10:37.864Z).
=== 2025-10-21 ===
* 18:51 "jrmuizel" was rejected (pending since 2025-07-22T18:50:07.315Z).
* 09:33 [[gitlab:dpogorzelski|@dpogorzelski]] was approved.
=== 2025-10-17 ===
* 13:21 [[gitlab:blegodwin|@blegodwin]] was approved.
=== 2025-10-16 ===
* 14:51 [[gitlab:bahago|@bahago]] was approved.
* 14:12 "harikrishna0005" was rejected (pending since 2025-07-17T14:10:48.385Z).
* 14:09 "gauthammohanraj" was rejected (pending since 2025-07-17T14:08:47.643Z).
=== 2025-10-15 ===
* 13:48 [[gitlab:adwivedii|@adwivedii]] was approved.
* 13:18 [[gitlab:kimbrenekakande|@kimbrenekakande]] was approved.
* 13:03 "childmnajennifer" was rejected (pending since 2025-07-16T13:01:50.236Z).
* 05:06 "vssb4214" was rejected (pending since 2025-07-16T05:05:33.985Z).
=== 2025-10-14 ===
* 19:39 [[gitlab:afanyulionel|@afanyulionel]] was approved.
* 15:33 [[gitlab:sadrettin|@sadrettin]] was approved.
* 14:18 [[gitlab:tmwyk|@tmwyk]] was approved.
* 08:42 "yasu0796" was rejected (pending since 2025-07-15T08:41:26.453Z).
=== 2025-10-13 ===
* 16:09 [[gitlab:atlas0007|@atlas0007]] was approved.
=== 2025-10-11 ===
* 17:42 [[gitlab:techwizzie|@techwizzie]] was approved.
=== 2025-10-10 ===
* 19:03 [[gitlab:miiswom|@miiswom]] was approved.
* 16:06 [[gitlab:ninatakang|@ninatakang]] was approved.
=== 2025-10-09 ===
* 15:42 [[gitlab:jaykaneki|@jaykaneki]] was approved.
* 14:21 [[gitlab:lebogang|@lebogang]] was approved.
* 14:15 [[gitlab:kimondorose|@kimondorose]] was approved.
* 13:48 [[gitlab:joyakinyi|@joyakinyi]] was approved.
* 13:48 [[gitlab:dikshyashahi|@dikshyashahi]] was approved.
* 13:45 [[gitlab:obediobadiah|@obediobadiah]] was approved.
* 13:45 [[gitlab:system625|@system625]] was approved.
* 13:45 [[gitlab:rolalove|@rolalove]] was approved.
* 13:39 [[gitlab:olatundeawo|@olatundeawo]] was approved.
* 13:36 [[gitlab:danielchristlight|@danielchristlight]] was approved.
* 13:36 [[gitlab:dipanshu1223|@dipanshu1223]] was approved.
* 13:36 [[gitlab:aradhya|@aradhya]] was approved.
* 09:57 "bognd" was rejected (pending since 2025-07-10T09:55:48.661Z).
=== 2025-10-08 ===
* 23:36 [[gitlab:sopzy|@sopzy]] was approved.
* 23:03 [[gitlab:oluwatumininu|@oluwatumininu]] was approved.
* 19:39 [[gitlab:levon003|@levon003]] was approved.
* 15:24 [[gitlab:ritika-bhambri11|@ritika-bhambri11]] was approved.
* 13:45 [[gitlab:anbanguyen|@anbanguyen]] was approved.
* 13:36 [[gitlab:chumzine|@chumzine]] was approved.
* 13:27 [[gitlab:shr0x-ya|@shr0x-ya]] was approved.
* 12:45 [[gitlab:nurahwakili|@nurahwakili]] was approved.
* 03:42 "nazhiba" was rejected (pending since 2025-07-09T03:40:12.625Z).
* 02:12 "mafennel" was rejected (pending since 2025-07-09T02:11:40.598Z).
=== 2025-10-07 ===
* 22:54 [[gitlab:olusegunfaj|@olusegunfaj]] was approved.
* 21:30 [[gitlab:rona|@rona]] was approved.
* 21:09 [[gitlab:sandijigs|@sandijigs]] was approved.
* 13:36 "xisbajao" was rejected (pending since 2025-07-08T13:33:35.018Z).
* 01:36 "areczek94" was rejected (pending since 2025-07-08T01:35:40.633Z).
=== 2025-10-06 ===
* 19:21 "wmcarter2017" was rejected (pending since 2025-07-07T19:21:12.899Z).
=== 2025-10-05 ===
* 14:15 "meetmendapara" was rejected (pending since 2025-07-06T14:14:16.726Z).
=== 2025-10-04 ===
* 20:51 "nftbaee" was rejected (pending since 2025-07-05T20:50:57.688Z).
=== 2025-10-03 ===
* 06:12 [[gitlab:javiermonton|@javiermonton]] was approved.
=== 2025-10-02 ===
* 20:15 "talaqalotaibipmp" was rejected (pending since 2025-07-03T20:13:05.164Z).
=== 2025-10-01 ===
* 10:54 "bjensen" was rejected (pending since 2025-07-02T10:53:46.574Z).
* 02:45 "kowal1984" was rejected (pending since 2025-07-02T02:44:56.946Z).
=== 2025-09-30 ===
* 21:21 [[gitlab:kavaljeetsingh|@kavaljeetsingh]] was approved.
* 00:24 "adium" was rejected (pending since 2025-07-01T00:23:43.807Z).
=== 2025-09-28 ===
* 08:54 [[gitlab:pexerik|@pexerik]] was approved.
=== 2025-09-27 ===
* 13:57 [[gitlab:rubahhitamvukova|@rubahhitamvukova]] was approved.
=== 2025-09-26 ===
* 16:57 "algorithmic" was rejected (pending since 2025-06-27T16:56:17.480Z).
* 13:54 [[gitlab:shadabgdg|@shadabgdg]] was approved.
* 13:12 [[gitlab:spushpit|@spushpit]] was approved.
=== 2025-09-20 ===
* 14:06 "bwiki" was rejected (pending since 2025-06-21T13:59:14.749Z).
=== 2025-09-16 ===
* 05:39 [[gitlab:deepchirp|@deepchirp]] was approved.
=== 2025-09-15 ===
* 22:00 [[gitlab:noisk8|@noisk8]] was approved.
* 11:03 "ahonc" was rejected (pending since 2025-06-16T11:00:54.843Z).
=== 2025-09-13 ===
* 18:24 "a-ssh22" was rejected (pending since 2025-06-14T18:23:33.937Z).
* 12:36 [[gitlab:rajashreetalukdar|@rajashreetalukdar]] was approved.
* 00:45 [[gitlab:sumitsurai|@sumitsurai]] was approved.
=== 2025-09-12 ===
* 17:12 [[gitlab:suyash23|@suyash23]] was approved.
* 00:46 "remotetravel" was rejected (pending since 2025-06-13T00:44:08.171Z).
=== 2025-09-10 ===
* 21:09 "jancborchardt" was rejected (pending since 2025-06-11T21:06:30.759Z).
=== 2025-09-09 ===
* 17:03 [[gitlab:vwf|@vwf]] was approved.
* 06:36 [[gitlab:cactusisme|@cactusisme]] was approved.
=== 2025-09-08 ===
* 18:09 "birushandegeya" was rejected (pending since 2025-06-09T18:08:00.087Z).
* 16:27 "ngarnsworthy" was rejected (pending since 2025-06-09T16:24:37.213Z).
* 12:33 "zolgoyo" was rejected (pending since 2025-06-09T12:31:34.199Z).
=== 2025-09-06 ===
* 23:09 [[gitlab:jaishsingh913|@jaishsingh913]] was approved.
=== 2025-09-05 ===
* 21:45 [[gitlab:sakshi2|@sakshi2]] was approved.
* 20:42 "abdukhaliq1" was rejected (pending since 2025-06-06T20:40:42.023Z).
* 14:27 "beubsamy" was rejected (pending since 2025-06-06T14:27:06.781Z).
=== 2025-09-04 ===
* 23:27 "sdhehua" was rejected (pending since 2025-06-05T23:24:45.777Z).
* 19:00 [[gitlab:perry|@perry]] was approved.
* 11:24 "saintwolf" was rejected (pending since 2025-06-05T11:21:20.176Z).
=== 2025-09-02 ===
* 05:48 [[gitlab:aliu|@aliu]] was approved.
=== 2025-08-29 ===
* 13:30 "kksurendran066" was rejected (pending since 2025-05-30T13:27:48.755Z).
=== 2025-08-28 ===
* 22:18 "tauraamuix" was rejected (pending since 2025-05-29T22:16:08.228Z).
=== 2025-08-26 ===
* 19:03 [[gitlab:dikkulah|@dikkulah]] was approved.
=== 2025-08-22 ===
* 23:51 [[gitlab:khoroshun_mike|@khoroshun_mike]] was approved.
=== 2025-08-21 ===
* 07:39 [[gitlab:yuka|@yuka]] was approved.
=== 2025-08-19 ===
* 07:48 [[gitlab:zhaofjx|@zhaofjx]] was approved.
=== 2025-08-17 ===
* 14:27 "madhan13k" was rejected (pending since 2025-05-18T14:26:08.973Z).
=== 2025-08-15 ===
* 10:15 "mohammed_abukhadra" was rejected (pending since 2025-05-16T10:14:48.403Z).
=== 2025-08-11 ===
* 11:48 "hmmyesbro" was rejected (pending since 2025-05-12T11:45:24.350Z).
=== 2025-08-10 ===
* 13:15 [[gitlab:dactyl|@dactyl]] was approved.
=== 2025-08-09 ===
* 04:39 "xxxx100000" was rejected (pending since 2025-05-10T04:37:44.949Z).
=== 2025-08-08 ===
* 14:33 [[gitlab:josefanthony|@josefanthony]] was approved.
=== 2025-08-07 ===
* 23:42 [[gitlab:robins7|@robins7]] was approved.
* 21:42 [[gitlab:pols12|@pols12]] was approved.
* 17:15 "sbronson" was rejected (pending since 2025-05-08T17:15:08.834Z).
* 14:57 [[gitlab:alvindulle|@alvindulle]] was approved.
* 14:45 [[gitlab:xentos|@xentos]] was approved.
* 06:27 "jamesboste" was rejected (pending since 2025-05-08T06:25:14.793Z).
* 03:57 "ysun" was rejected (pending since 2025-05-08T03:55:07.348Z).
=== 2025-08-06 ===
* 21:51 "pols12" was rejected (pending since 2025-05-07T21:49:13.598Z).
* 01:51 "okeamah" was rejected (pending since 2025-05-07T01:48:50.114Z).
=== 2025-08-05 ===
* 09:15 "mobashir-2013" was rejected (pending since 2025-05-06T09:14:24.069Z).
=== 2025-08-01 ===
* 08:00 "douginamug" was rejected (pending since 2025-05-02T07:57:38.317Z).
=== 2025-07-31 ===
* 02:30 [[gitlab:ads|@ads]] was approved.
=== 2025-07-27 ===
* 13:15 "mrico2703" was rejected (pending since 2025-04-27T13:13:12.346Z).
* 10:17 [[gitlab:josephfrancis12|@josephfrancis12]] was approved.
* 10:17 [[gitlab:fuzzew|@fuzzew]] was approved.
* 05:57 [[gitlab:biscuitbobby|@biscuitbobby]] was approved.
* 05:48 [[gitlab:ecoholic|@ecoholic]] was approved.
=== 2025-07-26 ===
* 11:48 [[gitlab:chimnayyyy|@chimnayyyy]] was approved.
* 11:48 [[gitlab:alwinalbert|@alwinalbert]] was approved.
* 11:48 [[gitlab:hridyakk|@hridyakk]] was approved.
* 11:45 [[gitlab:gaurigupta21|@gaurigupta21]] was approved.
* 11:45 [[gitlab:binetaa|@binetaa]] was approved.
* 10:21 [[gitlab:jyothikat22|@jyothikat22]] was approved.
* 10:21 [[gitlab:zobotrombie|@zobotrombie]] was approved.
* 10:21 [[gitlab:flykrth|@flykrth]] was approved.
* 10:21 [[gitlab:mehrinshamim|@mehrinshamim]] was approved.
* 10:21 [[gitlab:aadhi13|@aadhi13]] was approved.
* 10:21 [[gitlab:malavikam05|@malavikam05]] was approved.
* 10:18 [[gitlab:nf609|@nf609]] was approved.
* 05:48 [[gitlab:nazalnihad|@nazalnihad]] was approved.
* 05:48 [[gitlab:naveen28204280|@naveen28204280]] was approved.
=== 2025-07-25 ===
* 09:49 [[gitlab:kasyap9|@kasyap9]] was approved.
* 09:30 [[gitlab:swayamagrahari|@swayamagrahari]] was approved.
=== 2025-07-24 ===
* 19:36 [[gitlab:madutgn|@madutgn]] was approved.
=== 2025-07-23 ===
* 20:09 [[gitlab:somerandomdeveloper|@somerandomdeveloper]] was approved.
=== 2025-07-22 ===
* 00:15 [[gitlab:iagoqnsi|@iagoqnsi]] was approved.
=== 2025-07-21 ===
* 17:30 [[gitlab:asadiqui|@asadiqui]] was approved.
* 16:39 [[gitlab:tryvix1509|@tryvix1509]] was approved.
* 04:27 [[gitlab:damian|@damian]] was approved.
=== 2025-07-20 ===
* 09:42 "mike-khoroshun" was rejected (pending since 2025-04-20T09:42:22.732Z).
=== 2025-07-17 ===
* 17:57 [[gitlab:haroldkrabs|@haroldkrabs]] was approved.
* 13:45 [[gitlab:envlh|@envlh]] was approved.
=== 2025-07-14 ===
* 10:24 [[gitlab:missguru|@missguru]] was approved.
* 00:57 "clarfonthey" was rejected (pending since 2025-04-14T00:56:32.626Z).
=== 2025-07-13 ===
* 01:01 [[gitlab:l235|@l235]] was approved.
=== 2025-07-11 ===
* 03:06 "rodavlas" was rejected (pending since 2025-04-11T03:05:45.590Z).
=== 2025-07-06 ===
* 00:09 "lakasa" was rejected (pending since 2025-04-06T00:06:28.469Z).
=== 2025-07-05 ===
* 21:54 "ctrlzvi" was rejected (pending since 2025-04-05T21:54:12.542Z).
* 14:30 "aminualiyu" was rejected (pending since 2025-04-05T14:27:22.617Z).
=== 2025-07-04 ===
* 03:15 [[gitlab:galstar|@galstar]] was approved.
=== 2025-07-02 ===
* 11:27 "vicolas11" was rejected (pending since 2025-04-02T11:25:12.682Z).
=== 2025-06-29 ===
* 23:12 "naomi723" was rejected (pending since 2025-03-30T23:09:24.630Z).
=== 2025-06-28 ===
* 16:21 "mudeh2372" was rejected (pending since 2025-03-29T16:18:27.057Z).
=== 2025-06-27 ===
* 23:18 "rony143" was rejected (pending since 2025-03-28T23:16:13.671Z).
* 22:21 [[gitlab:rluts|@rluts]] was approved.
=== 2025-06-26 ===
* 13:54 "creativegurus" was rejected (pending since 2025-03-27T13:52:41.706Z).
=== 2025-06-24 ===
* 17:42 [[gitlab:devjadiya|@devjadiya]] was approved.
* 14:00 "dominic-r" was rejected (pending since 2025-03-25T14:00:07.307Z).
=== 2025-06-21 ===
* 00:48 [[gitlab:vriaa|@vriaa]] was approved.
=== 2025-06-18 ===
* 15:21 "ayushkhati1" was rejected (pending since 2025-03-19T15:18:50.062Z).
=== 2025-06-17 ===
* 20:45 "chiomavero" was rejected (pending since 2025-03-18T20:44:13.967Z).
* 00:27 [[gitlab:eggroll97|@eggroll97]] was approved.
=== 2025-06-14 ===
* 20:57 "volvox" was rejected (pending since 2025-03-15T20:56:34.018Z).
=== 2025-06-13 ===
* 16:09 [[gitlab:supergrey|@supergrey]] was approved.
* 11:03 "chqaz" was rejected (pending since 2025-03-14T11:01:09.600Z).
* 10:24 [[gitlab:slong-wmf|@slong-wmf]] was approved.
* 10:15 "hearvox" was rejected (pending since 2025-03-14T10:13:13.112Z).
=== 2025-06-12 ===
* 15:18 "jlam" was rejected (pending since 2025-03-13T15:17:54.099Z).
=== 2025-06-09 ===
* 20:48 "dipanjansengupta" was rejected (pending since 2025-03-10T20:48:03.545Z).
* 19:27 [[gitlab:reggycelly|@reggycelly]] was approved.
* 14:51 "arendpieter" was rejected (pending since 2025-03-10T14:51:01.445Z).
* 13:21 [[gitlab:greenreaper|@greenreaper]] was approved.
* 09:33 [[gitlab:mmta|@mmta]] was approved.
* 08:03 "a-ssh22" was rejected (pending since 2025-03-10T08:03:08.111Z).
=== 2025-06-08 ===
* 21:06 "mm-episodenlistedlvaupdater" was rejected (pending since 2025-03-09T21:04:06.323Z).
=== 2025-06-06 ===
* 11:06 [[gitlab:olea|@olea]] was approved.
=== 2025-06-05 ===
* 20:33 [[gitlab:encodedwp|@encodedwp]] was approved.
* 15:00 [[gitlab:toluayo|@toluayo]] was approved.
* 13:51 [[gitlab:arnold_lup|@arnold_lup]] was approved.
* 11:54 "sdhehua" was rejected (pending since 2025-03-06T11:51:48.241Z).
=== 2025-06-03 ===
* 21:27 [[gitlab:wewakey|@wewakey]] was approved.
* 12:36 "hunsimon2" was rejected (pending since 2025-03-04T12:34:56.520Z).
* 11:54 "hunsimon" was rejected (pending since 2025-03-04T11:53:54.652Z).
=== 2025-06-02 ===
* 12:01 [[gitlab:jaimedes|@jaimedes]] was approved.
=== 2025-05-30 ===
* 18:00 "sathvik9105" was rejected (pending since 2025-02-28T17:59:42.867Z).
* 11:21 [[gitlab:tonythomas01|@tonythomas01]] was approved.
* 10:06 [[gitlab:gpsleo|@gpsleo]] was approved.
=== 2025-05-29 ===
* 22:12 [[gitlab:codynguyen1116|@codynguyen1116]] was approved.
=== 2025-05-28 ===
* 02:57 [[gitlab:saper|@saper]] was approved.
=== 2025-05-27 ===
* 21:06 [[gitlab:mohammed_qays|@mohammed_qays]] was approved.
* 15:33 "satanluimm" was rejected (pending since 2025-02-25T15:32:48.101Z).
=== 2025-05-26 ===
* 23:57 "seyedali220" was rejected (pending since 2025-02-24T23:56:17.621Z).
=== 2025-05-21 ===
* 11:12 [[gitlab:guilherme|@guilherme]] was approved.
=== 2025-05-19 ===
* 13:24 [[gitlab:emojiwiki|@emojiwiki]] was approved.
=== 2025-05-18 ===
* 00:00 "xidme" was rejected (pending since 2025-02-15T23:58:56.796Z).
=== 2025-05-17 ===
* 02:39 "kdh8219" was rejected (pending since 2025-02-15T02:36:32.237Z).
=== 2025-05-16 ===
* 15:09 [[gitlab:maxbinderwmf|@maxbinderwmf]] was approved.
=== 2025-05-15 ===
* 04:30 "inspectorzer0" was rejected (pending since 2025-02-13T04:27:33.179Z).
=== 2025-05-14 ===
* 17:42 [[gitlab:llugo|@llugo]] was approved.
=== 2025-05-13 ===
* 20:18 "mmta" was rejected (pending since 2025-02-11T20:17:23.407Z).
=== 2025-05-11 ===
* 20:51 "jad" was rejected (pending since 2025-02-09T20:49:07.333Z).
* 17:54 "nishchalsundan" was rejected (pending since 2025-02-09T17:52:25.761Z).
* 16:39 "mohammed_abukhadra" was rejected (pending since 2025-02-09T16:39:03.730Z).
=== 2025-05-09 ===
* 09:12 [[gitlab:sirchanmp|@sirchanmp]] was approved.
=== 2025-05-08 ===
* 08:18 [[gitlab:mengeditch|@mengeditch]] was approved.
=== 2025-05-07 ===
* 03:45 "xluffy" was rejected (pending since 2025-02-05T03:45:14.181Z).
=== 2025-05-06 ===
* 16:54 "punhaniabhishek" was rejected (pending since 2025-02-04T16:53:50.758Z).
* 09:36 [[gitlab:bmartinezcalvo|@bmartinezcalvo]] was approved.
=== 2025-05-02 ===
* 12:24 [[gitlab:tohaomg|@tohaomg]] was approved.
* 11:48 [[gitlab:mavrikant|@mavrikant]] was approved.
* 11:45 [[gitlab:daanvr|@daanvr]] was approved.
=== 2025-05-01 ===
* 09:09 "mjoerg" was rejected (pending since 2025-01-30T09:09:04.204Z).
=== 2025-04-30 ===
* 23:06 "sanskardubey" was rejected (pending since 2025-01-29T23:03:25.489Z).
=== 2025-04-29 ===
* 16:00 "geyslein" was rejected (pending since 2025-01-28T16:00:01.510Z).
=== 2025-04-26 ===
* 09:30 "anjali9027" was rejected (pending since 2025-01-25T09:28:07.064Z).
=== 2025-04-25 ===
* 18:00 "salahhazaa" was rejected (pending since 2025-01-24T17:58:30.030Z).
* 15:15 [[gitlab:yiming|@yiming]] was approved.
* 02:06 "mrchanmp" was rejected (pending since 2025-01-24T02:03:58.308Z).
=== 2025-04-23 ===
* 17:03 "rj2904" was rejected (pending since 2025-01-22T17:03:11.207Z).
* 14:21 "nischay33" was rejected (pending since 2025-01-22T14:19:21.081Z).
=== 2025-04-22 ===
* 19:27 "dj80" was rejected (pending since 2025-01-21T19:25:28.498Z).
* 14:30 [[gitlab:kaimamin|@kaimamin]] was approved.
* 09:57 "debo" was rejected (pending since 2025-01-21T09:54:47.955Z).
=== 2025-04-21 ===
* 12:24 "unshell" was rejected (pending since 2025-01-20T12:21:59.686Z).
=== 2025-04-18 ===
* 15:06 [[gitlab:spartanarbinger|@spartanarbinger]] was approved.
=== 2025-04-16 ===
* 03:09 "dewey" was rejected (pending since 2025-01-15T03:06:17.488Z).
=== 2025-04-15 ===
* 19:45 "emdadul" was rejected (pending since 2025-01-14T19:42:29.285Z).
=== 2025-04-14 ===
* 06:45 [[gitlab:bcampbell804|@bcampbell804]] was approved.
=== 2025-04-11 ===
* 06:27 [[gitlab:jvanderhoop|@jvanderhoop]] was approved.
=== 2025-04-10 ===
* 04:12 "bhai420" was rejected (pending since 2025-01-09T04:10:29.430Z).
=== 2025-04-09 ===
* 05:03 "austinvarshney" was rejected (pending since 2025-01-08T05:02:34.175Z).
=== 2025-04-06 ===
* 15:36 [[gitlab:elph|@elph]] was approved.
=== 2025-04-02 ===
* 10:33 [[gitlab:ozge|@ozge]] was approved.
=== 2025-03-31 ===
* 20:15 "demandkey" was rejected (pending since 2024-12-30T20:14:23.096Z).
* 15:18 [[gitlab:danyya|@danyya]] was approved.
=== 2025-03-28 ===
* 15:54 [[gitlab:rutsavi09|@rutsavi09]] was approved.
* 15:54 [[gitlab:ilanen1|@ilanen1]] was approved.
=== 2025-03-25 ===
* 19:27 [[gitlab:irfo|@irfo]] was approved.
* 11:54 [[gitlab:kmontalva-wmf|@kmontalva-wmf]] was approved.
* 04:33 [[gitlab:paul26|@paul26]] was approved.
* 04:18 "as1100k" was rejected (pending since 2024-12-24T04:18:06.813Z).
=== 2025-03-24 ===
* 11:33 "amzadkhankk" was rejected (pending since 2024-12-23T11:33:14.176Z).
=== 2025-03-23 ===
* 12:24 "wolfdo" was rejected (pending since 2024-12-22T12:23:35.056Z).
=== 2025-03-22 ===
* 09:45 [[gitlab:fjmustak|@fjmustak]] was approved.
=== 2025-03-20 ===
* 18:42 "sathishkokila" was rejected (pending since 2024-12-19T18:39:35.161Z).
* 17:03 [[gitlab:alien4444|@alien4444]] was approved.
* 15:27 [[gitlab:davidcoronel|@davidcoronel]] was approved.
=== 2025-03-19 ===
* 22:57 [[gitlab:r1f4t|@r1f4t]] was approved.
* 19:03 "daniel24ps" was rejected (pending since 2024-12-18T19:00:21.249Z).
* 14:18 [[gitlab:beepbooppenguin|@beepbooppenguin]] was approved.
=== 2025-03-18 ===
* 17:48 "rahulkundu1209" was rejected (pending since 2024-12-17T17:46:41.936Z).
* 08:15 "kirtisikka972" was rejected (pending since 2024-12-17T08:13:25.487Z).
=== 2025-03-15 ===
* 13:30 "tulspal_sidhu" was rejected (pending since 2024-12-14T13:29:10.606Z).
* 01:39 "peacedeadc" was rejected (pending since 2024-12-14T01:37:36.579Z).
=== 2025-03-14 ===
* 03:51 [[gitlab:chuckthebuck|@chuckthebuck]] was approved.
* 02:33 "yxngtrtxll" was rejected (pending since 2024-12-13T02:31:51.658Z).
=== 2025-03-13 ===
* 14:36 [[gitlab:iccander|@iccander]] was approved.
=== 2025-03-12 ===
* 23:21 "jokerchic36" was rejected (pending since 2024-12-11T23:21:00.670Z).
* 15:30 [[gitlab:naomi|@naomi]] was approved.
* 15:27 [[gitlab:cobi|@cobi]] was approved.
=== 2025-03-11 ===
* 12:42 "mohitvermaxx" was rejected (pending since 2024-12-10T12:40:56.967Z).
=== 2025-03-10 ===
* 16:51 [[gitlab:nanona15dobato|@nanona15dobato]] was approved.
=== 2025-03-09 ===
* 22:39 [[gitlab:jonkolbert|@jonkolbert]] was approved.
* 20:45 [[gitlab:urbanecmtest2|@urbanecmtest2]] was approved.
=== 2025-03-07 ===
* 16:54 [[gitlab:hswan|@hswan]] was approved.
* 14:42 [[gitlab:atitkov|@atitkov]] was approved.
* 00:42 [[gitlab:infrastruktur|@infrastruktur]] was approved.
=== 2025-03-06 ===
* 17:21 "johnmann" was rejected (pending since 2024-12-05T17:19:24.995Z).
=== 2025-03-05 ===
* 07:33 [[gitlab:monx9494|@monx9494]] was approved.
=== 2025-03-02 ===
* 21:21 "paul26" was rejected (pending since 2024-12-01T21:20:19.681Z).
=== 2025-03-01 ===
* 19:15 [[gitlab:izno|@izno]] was approved.
* 12:45 [[gitlab:nyerho|@nyerho]] was approved.
=== 2025-02-28 ===
* 18:27 [[gitlab:chuckonwumelu|@chuckonwumelu]] was approved.
* 13:09 "ashwinpraveengo" was rejected (pending since 2024-11-29T13:07:47.240Z).
* 00:18 "eduardoaugusto" was rejected (pending since 2024-11-29T00:17:43.372Z).
=== 2025-02-27 ===
* 20:39 "volkanurl" was rejected (pending since 2024-11-28T20:37:18.101Z).
=== 2025-02-24 ===
* 21:15 [[gitlab:feeglgeef|@feeglgeef]] was approved.
* 20:18 [[gitlab:piaanalysis2|@piaanalysis2]] was approved.
* 19:06 [[gitlab:dhardy|@dhardy]] was approved.
=== 2025-02-22 ===
* 19:27 [[gitlab:owuh|@owuh]] was approved.
=== 2025-02-19 ===
* 16:06 [[gitlab:artemkloko|@artemkloko]] was approved.
* 13:03 [[gitlab:jgafnea|@jgafnea]] was approved.
=== 2025-02-17 ===
* 16:33 [[gitlab:asmartkitten|@asmartkitten]] was approved.
=== 2025-02-16 ===
* 19:12 "gaurigupta21" was rejected (pending since 2024-11-17T19:11:07.416Z).
=== 2025-02-15 ===
* 01:18 [[gitlab:mediawiki-quickstart-ci|@mediawiki-quickstart-ci]] was approved.
=== 2025-02-14 ===
* 15:21 "nathanbnm" was rejected (pending since 2024-11-15T15:18:19.632Z).
=== 2025-02-13 ===
* 16:45 [[gitlab:priyanshuchahal|@priyanshuchahal]] was approved.
* 16:42 [[gitlab:ajhalili2006|@ajhalili2006]] was approved.
=== 2025-02-12 ===
* 23:21 "monkeypatch999" was rejected (pending since 2024-11-13T23:20:38.398Z).
* 06:36 [[gitlab:jainlakshita28|@jainlakshita28]] was approved.
=== 2025-02-11 ===
* 19:27 [[gitlab:matthewsm2|@matthewsm2]] was approved.
=== 2025-02-09 ===
* 16:15 "mohammed_abukhadra" was rejected (pending since 2024-11-10T16:15:18.361Z).
=== 2025-02-07 ===
* 21:33 "brennan" was rejected (pending since 2024-11-08T21:31:07.351Z).
=== 2025-02-06 ===
* 08:24 "mmta" was rejected (pending since 2024-11-07T08:22:36.724Z).
* 06:21 [[gitlab:bunnypranav|@bunnypranav]] was approved.
=== 2025-02-05 ===
* 22:39 "chrissteinchen" was rejected (pending since 2024-11-06T22:38:16.673Z).
=== 2025-02-03 ===
* 07:45 "edriiic" was rejected (pending since 2024-11-04T07:44:46.849Z).
* 01:12 "geppy" was rejected (pending since 2024-11-04T01:10:48.710Z).
=== 2025-02-02 ===
* 13:18 "funa-enpitu" was rejected (pending since 2024-11-03T13:15:46.065Z).
=== 2025-01-31 ===
* 23:42 "nfontes" was rejected (pending since 2024-11-01T23:39:41.755Z).
* 22:51 "sbronson" was rejected (pending since 2024-11-01T22:50:31.871Z).
* 00:42 [[gitlab:farid|@farid]] was approved.
=== 2025-01-27 ===
* 08:15 [[gitlab:eliza189|@eliza189]] was approved.
=== 2025-01-25 ===
* 09:51 [[gitlab:pamputt|@pamputt]] was approved.
=== 2025-01-23 ===
* 14:30 [[gitlab:lubianat|@lubianat]] was approved.
* 11:45 [[gitlab:bootsa|@bootsa]] was approved.
=== 2025-01-21 ===
* 05:09 "niko" was rejected (pending since 2024-07-21T16:10:01.377Z).
* 05:09 "thawizkid369777" was rejected (pending since 2024-07-18T17:42:44.493Z).
* 05:09 "sarthaksingh2" was rejected (pending since 2024-07-10T11:31:30.470Z).
* 05:09 "shriyakt" was rejected (pending since 2024-07-06T04:54:10.248Z).
* 05:09 "akshaya" was rejected (pending since 2024-07-06T04:04:51.488Z).
* 05:09 "alaka03aj" was rejected (pending since 2024-07-05T18:01:54.876Z).
* 05:09 "sulochanaviji-5049" was rejected (pending since 2024-07-01T05:58:00.427Z).
* 05:09 "nayanjnath" was rejected (pending since 2024-07-01T02:51:57.405Z).
* 05:09 "sd44" was rejected (pending since 2024-06-30T04:28:51.436Z).
* 05:09 "metavalent" was rejected (pending since 2024-06-29T01:37:14.210Z).
* 05:09 "wicloudx" was rejected (pending since 2024-06-28T11:51:23.335Z).
* 05:09 "debo" was rejected (pending since 2024-06-28T01:44:59.845Z).
* 05:09 "bwiki" was rejected (pending since 2024-06-23T14:15:38.032Z).
* 05:09 "toprak" was rejected (pending since 2024-06-23T11:35:50.819Z).
* 05:09 "iristeller" was rejected (pending since 2024-06-14T20:53:48.959Z).
* 05:09 "jcolvin" was rejected (pending since 2024-06-12T17:29:01.238Z).
* 05:09 "kalyan" was rejected (pending since 2024-06-07T07:52:46.993Z).
* 05:09 "bluecrystal" was rejected (pending since 2024-06-06T19:16:20.107Z).
* 05:09 "iftttrohit" was rejected (pending since 2024-06-04T12:08:50.818Z).
* 05:09 "pogpotato" was rejected (pending since 2024-06-03T17:58:21.684Z).
* 05:09 "cptlausebaer" was rejected (pending since 2024-05-31T18:53:27.692Z).
* 05:09 "hdevine825" was rejected (pending since 2024-05-31T17:04:18.279Z).
* 05:09 "anaghaa18" was rejected (pending since 2024-05-25T19:14:31.803Z).
* 05:09 "atharvanair04" was rejected (pending since 2024-05-25T14:24:52.825Z).
* 05:09 "anasvemmully" was rejected (pending since 2024-05-25T06:10:27.261Z).
* 05:09 "abhinavmohandas" was rejected (pending since 2024-05-25T06:05:24.825Z).
* 05:09 "kksurendran06" was rejected (pending since 2024-05-25T06:04:38.082Z).
* 05:09 "albertmarshall8896" was rejected (pending since 2024-05-23T09:32:05.462Z).
* 05:09 "akellison" was rejected (pending since 2024-05-17T02:07:24.229Z).
* 05:09 "mainowill" was rejected (pending since 2024-04-16T23:30:33.881Z).
* 05:09 "bzhqc" was rejected (pending since 2024-04-16T19:50:38.676Z).
* 05:09 "safan41" was rejected (pending since 2024-04-16T03:34:48.942Z).
* 05:09 "mgagat" was rejected (pending since 2024-04-16T03:21:51.764Z).
* 05:09 "okeamah" was rejected (pending since 2024-04-16T02:49:00.143Z).
* 05:09 "xuhao61" was rejected (pending since 2024-04-15T23:45:09.083Z).
* 04:47 "cybel" was rejected (pending since 2024-04-15T06:46:35.791Z).
=== 2025-01-20 ===
* 14:33 [[gitlab:your1|@your1]] was approved.
=== 2025-01-18 ===
* 10:09 [[gitlab:galrach600|@galrach600]] was approved.
* 02:51 [[gitlab:blankeclair|@blankeclair]] was approved.
=== 2025-01-17 ===
* 13:57 [[gitlab:dsantamaria|@dsantamaria]] was approved.
=== 2025-01-15 ===
* 17:12 [[gitlab:smartse|@smartse]] was approved.
=== 2025-01-14 ===
* 17:03 [[gitlab:naorleizer|@naorleizer]] was approved.
=== 2025-01-13 ===
* 02:45 [[gitlab:wolf20482|@wolf20482]] was approved.
=== 2025-01-12 ===
* 17:45 [[gitlab:tamzin|@tamzin]] was approved.
=== 2025-01-11 ===
* 15:24 [[gitlab:bargioni|@bargioni]] was approved.
* 14:30 [[gitlab:salelya|@salelya]] was approved.
* 10:15 [[gitlab:malakatshy|@malakatshy]] was approved.
* 05:21 [[gitlab:newmcpee|@newmcpee]] was approved.
=== 2025-01-09 ===
* 15:30 [[gitlab:gkyziridis|@gkyziridis]] was approved.
=== 2025-01-08 ===
* 16:21 [[gitlab:ukrface|@ukrface]] was approved.
=== 2024-12-28 ===
* 03:27 [[gitlab:twonum|@twonum]] was approved.
=== 2024-12-25 ===
* 06:09 [[gitlab:harsv567|@harsv567]] was approved.
=== 2024-12-21 ===
* 11:24 [[gitlab:amutha2002|@amutha2002]] was approved.
=== 2024-12-20 ===
* 19:51 [[gitlab:hridyeshgupta|@hridyeshgupta]] was approved.
* 10:00 [[gitlab:ro-shines|@ro-shines]] was approved.
* 08:09 [[gitlab:kesharwaniarpita|@kesharwaniarpita]] was approved.
=== 2024-12-18 ===
* 14:45 [[gitlab:soylacarli|@soylacarli]] was approved.
=== 2024-12-16 ===
* 20:33 [[gitlab:aleyasiddika1|@aleyasiddika1]] was approved.
=== 2024-12-15 ===
* 07:33 [[gitlab:abhishek02bhardwaj|@abhishek02bhardwaj]] was approved.
=== 2024-12-13 ===
* 13:18 [[gitlab:ashmitabathre204|@ashmitabathre204]] was approved.
=== 2024-12-10 ===
* 06:39 [[gitlab:ginaan|@ginaan]] was approved.
=== 2024-12-09 ===
* 05:45 [[gitlab:kallinavya|@kallinavya]] was approved.
* 00:54 [[gitlab:viserion-7|@viserion-7]] was approved.
=== 2024-12-08 ===
* 17:27 [[gitlab:wargo|@wargo]] was approved.
=== 2024-12-05 ===
* 11:15 [[gitlab:ranjithraj|@ranjithraj]] was approved.
=== 2024-12-02 ===
* 21:21 [[gitlab:a930913|@a930913]] was approved.
=== 2024-12-01 ===
* 02:39 [[gitlab:kingchristlike1|@kingchristlike1]] was approved.
=== 2024-11-21 ===
* 13:45 [[gitlab:sascha|@sascha]] was approved.
=== 2024-11-19 ===
* 16:36 [[gitlab:jly|@jly]] was approved.
=== 2024-11-15 ===
* 02:54 [[gitlab:danielyepezgarces|@danielyepezgarces]] was approved.
=== 2024-11-14 ===
* 14:15 [[gitlab:stimoroll|@stimoroll]] was approved.
=== 2024-11-09 ===
* 17:15 [[gitlab:f4udeveloper|@f4udeveloper]] was approved.
=== 2024-11-07 ===
* 19:15 [[gitlab:zulf|@zulf]] was approved.
* 05:33 [[gitlab:hassanamin|@hassanamin]] was approved.
=== 2024-11-06 ===
* 19:39 [[gitlab:daniuu|@daniuu]] was approved.
* 00:18 [[gitlab:rlopez-wmf|@rlopez-wmf]] was approved.
=== 2024-10-09 ===
* 14:45 [[gitlab:jtweed|@jtweed]] was approved.
* 10:24 [[gitlab:ifrahkh|@ifrahkh]] was approved.
* 09:06 [[gitlab:wikibayer|@wikibayer]] was approved.
=== 2024-10-06 ===
* 10:27 [[gitlab:keerthan16|@keerthan16]] was approved.
=== 2024-10-04 ===
* 07:45 [[gitlab:hakimi97|@hakimi97]] was approved.
=== 2024-09-30 ===
* 07:39 [[gitlab:ninjastrikers|@ninjastrikers]] was approved.
=== 2024-09-28 ===
* 17:30 [[gitlab:webrunner95|@webrunner95]] was approved.
=== 2024-09-18 ===
* 21:39 [[gitlab:elliottetzkorn|@elliottetzkorn]] was approved.
=== 2024-09-14 ===
* 22:06 [[gitlab:humptydumpty|@humptydumpty]] was approved.
=== 2024-09-06 ===
* 08:48 [[gitlab:mickabarber|@mickabarber]] was approved.
=== 2024-08-27 ===
* 17:36 [[gitlab:edgars|@edgars]] was approved.
=== 2024-08-22 ===
* 09:18 [[gitlab:antonkokhwmde|@antonkokhwmde]] was approved.
=== 2024-08-14 ===
* 19:21 [[gitlab:jfk|@jfk]] was approved.
=== 2024-08-13 ===
* 17:57 [[gitlab:daxserver|@daxserver]] was approved.
=== 2024-08-11 ===
* 09:57 [[gitlab:pauliesnug|@pauliesnug]] was approved.
=== 2024-08-10 ===
* 08:42 [[gitlab:ashig|@ashig]] was approved.
=== 2024-08-09 ===
* 14:09 [[gitlab:masssly|@masssly]] was approved.
=== 2024-08-05 ===
* 22:15 [[gitlab:mrtortue|@mrtortue]] was approved.
=== 2024-08-02 ===
* 16:21 [[gitlab:dsantini|@dsantini]] was approved.
=== 2024-07-31 ===
* 11:54 [[gitlab:cptviraj|@cptviraj]] was approved.
=== 2024-07-30 ===
* 19:09 [[gitlab:iniquity|@iniquity]] was approved.
* 10:00 [[gitlab:collins|@collins]] was approved.
=== 2024-07-27 ===
* 15:57 [[gitlab:songnguxyz|@songnguxyz]] was approved.
=== 2024-07-25 ===
* 12:36 [[gitlab:mszabo|@mszabo]] was approved.
* 09:21 [[gitlab:agarwalmahima|@agarwalmahima]] was approved.
=== 2024-07-24 ===
* 08:05 [[gitlab:dragoniez|@dragoniez]] was approved.
=== 2024-07-23 ===
* 06:54 [[gitlab:mirji|@mirji]] was approved.
=== 2024-07-16 ===
* 10:00 [[gitlab:lakejason0|@lakejason0]] was approved.
=== 2024-07-12 ===
* 11:33 [[gitlab:cn|@cn]] was approved.
* 08:12 [[gitlab:unchampignon|@unchampignon]] was approved.
=== 2024-07-07 ===
* 17:12 [[gitlab:agamyasamuel|@agamyasamuel]] was approved.
* 05:24 [[gitlab:kuldeepburjbhalaike|@kuldeepburjbhalaike]] was approved.
=== 2024-07-06 ===
* 11:18 [[gitlab:dibya|@dibya]] was approved.
* 04:54 [[gitlab:sarthakparashar|@sarthakparashar]] was approved.
=== 2024-07-05 ===
* 18:15 [[gitlab:vanshikarathi|@vanshikarathi]] was approved.
=== 2024-07-02 ===
* 19:00 [[gitlab:ebrahim|@ebrahim]] was approved.
=== 2024-07-01 ===
* 20:12 [[gitlab:rockingpenny4|@rockingpenny4]] was approved.
* 18:15 [[gitlab:balajijagadesh|@balajijagadesh]] was approved.
=== 2024-06-30 ===
* 18:24 [[gitlab:hrideshmg|@hrideshmg]] was approved.
* 07:18 [[gitlab:chanakyakumardas|@chanakyakumardas]] was approved.
* 06:30 [[gitlab:rihaan180|@rihaan180]] was approved.
=== 2024-06-27 ===
* 17:36 [[gitlab:driedmueller|@driedmueller]] was approved.
=== 2024-06-19 ===
* 12:57 [[gitlab:audreypenven|@audreypenven]] was approved.
=== 2024-06-16 ===
* 01:18 [[gitlab:roysmith|@roysmith]] was approved.
=== 2024-06-08 ===
* 02:45 [[gitlab:jleedev|@jleedev]] was approved.
=== 2024-06-03 ===
* 13:57 [[gitlab:afeder|@afeder]] was approved.
=== 2024-06-01 ===
* 10:54 [[gitlab:florianschmitt|@florianschmitt]] was approved.
=== 2024-05-30 ===
* 16:42 [[gitlab:krlsca|@krlsca]] was approved.
=== 2024-05-28 ===
* 11:24 [[gitlab:rickijay|@rickijay]] was approved.
=== 2024-05-26 ===
* 11:18 [[gitlab:ranjithsiji|@ranjithsiji]] was approved.
=== 2024-05-25 ===
* 07:24 [[gitlab:jony|@jony]] was approved.
=== 2024-05-23 ===
* 08:45 [[gitlab:lepticed7|@lepticed7]] was approved.
=== 2024-05-22 ===
* 20:42 [[gitlab:echecs|@echecs]] was approved.
=== 2024-05-21 ===
* 13:33 [[gitlab:mbs|@mbs]] was approved.
=== 2024-05-19 ===
* 18:06 [[gitlab:ionenlaser|@ionenlaser]] was approved.
=== 2024-05-18 ===
* 23:36 [[gitlab:mdaniels5757|@mdaniels5757]] was approved.
=== 2024-05-17 ===
* 08:54 [[gitlab:grapedog|@grapedog]] was approved.
=== 2024-05-08 ===
* 19:42 [[gitlab:kelhurd|@kelhurd]] was approved.
* 19:06 [[gitlab:khurd|@khurd]] was approved.
=== 2024-05-06 ===
* 19:48 [[gitlab:j3j5|@j3j5]] was approved.
* 12:06 [[gitlab:tk-999|@tk-999]] was approved.
=== 2024-05-05 ===
* 22:09 [[gitlab:pppery|@pppery]] was approved.
* 20:33 [[gitlab:sakretsu|@sakretsu]] was approved.
* 12:12 [[gitlab:waterquark|@waterquark]] was approved.
=== 2024-05-04 ===
* 09:03 [[gitlab:multichill|@multichill]] was approved.
* 07:42 [[gitlab:abaris|@abaris]] was approved.
=== 2024-05-03 ===
* 14:57 [[gitlab:maurusian|@maurusian]] was approved.
=== 2024-04-24 ===
* 05:48 [[gitlab:wolfinux|@wolfinux]] was approved.
=== 2024-04-23 ===
* 15:48 [[gitlab:dreamrimmer|@dreamrimmer]] was approved.
=== 2024-04-21 ===
* 06:51 [[gitlab:alon|@alon]] was approved.
=== 2024-04-17 ===
* 23:33 [[gitlab:derenrich|@derenrich]] was approved.
=== 2024-04-16 ===
* 17:18 [[gitlab:valcio|@valcio]] was approved.
=== 2024-04-14 ===
* 16:51 [[gitlab:wikilucas00|@wikilucas00]] was approved.
=== 2024-04-06 ===
* 12:48 [[gitlab:theprotonade|@theprotonade]] was approved.
=== 2024-04-02 ===
* 07:30 [[gitlab:bohuizhang|@bohuizhang]] was approved.
=== 2024-03-30 ===
* 13:36 [[gitlab:lpintscher|@lpintscher]] was approved.
=== 2024-03-26 ===
* 17:09 [[gitlab:eenabulele|@eenabulele]] was approved.
=== 2024-03-25 ===
* 14:27 [[gitlab:tuukka|@tuukka]] was approved.
=== 2024-03-24 ===
* 12:24 [[gitlab:firefly|@firefly]] was approved.
=== 2024-03-21 ===
* 19:33 [[gitlab:universal-omega|@universal-omega]] was approved.
=== 2024-03-17 ===
* 10:36 [[gitlab:bisel91|@bisel91]] was approved.
=== 2024-03-16 ===
* 10:09 [[gitlab:delord|@delord]] was approved.
* 00:42 [[gitlab:athulvis1|@athulvis1]] was approved.
=== 2024-03-15 ===
* 19:06 [[gitlab:ignaciorodrguez|@ignaciorodrguez]] was approved.
* 08:30 [[gitlab:peachey88|@peachey88]] was approved.
* 06:51 [[gitlab:derick|@derick]] was approved.
=== 2024-03-12 ===
* 15:06 [[gitlab:xiaoxiao|@xiaoxiao]] was approved.
=== 2024-03-06 ===
* 13:21 [[gitlab:desianabae1|@desianabae1]] was approved.
=== 2024-03-05 ===
* 19:21 [[gitlab:ep1c|@ep1c]] was approved.
* 16:33 [[gitlab:jasmine|@jasmine]] was approved.
=== 2024-03-02 ===
* 06:42 [[gitlab:potsdamlamb|@potsdamlamb]] was approved.
=== 2024-02-29 ===
* 23:18 [[gitlab:arandomname123|@arandomname123]] was approved.
* 18:03 [[gitlab:baba|@baba]] was approved.
* 17:48 [[gitlab:yfdyh000|@yfdyh000]] was approved.
* 03:09 [[gitlab:sds|@sds]] was approved.
=== 2024-02-27 ===
* 23:33 [[gitlab:lofhi|@lofhi]] was approved.
=== 2024-02-15 ===
* 19:45 [[gitlab:gergesshamon|@gergesshamon]] was approved.
=== 2024-02-14 ===
* 14:33 [[gitlab:philipnelson99|@philipnelson99]] was approved.
=== 2024-02-13 ===
* 13:06 [[gitlab:dringsim|@dringsim]] was approved.
=== 2024-02-12 ===
* 17:36 [[gitlab:haak|@haak]] was approved.
=== 2024-02-05 ===
* 17:33 [[gitlab:qwerfjkl|@qwerfjkl]] was approved.
* 17:14 [[gitlab:ahecht|@ahecht]] was approved.
=== 2024-02-01 ===
* 09:27 [[gitlab:arinaigum|@arinaigum]] was approved.
* 00:15 [[gitlab:jas42|@jas42]] was approved.
* 00:15 [[gitlab:edhu|@edhu]] was approved.
* 00:15 [[gitlab:marnanel|@marnanel]] was approved.
* 00:15 [[gitlab:ibrahemqasim|@ibrahemqasim]] was approved.
* 00:15 [[gitlab:amasotti|@amasotti]] was approved.
* 00:15 [[gitlab:deni|@deni]] was approved.
* 00:15 [[gitlab:cyber|@cyber]] was approved.
* 00:15 [[gitlab:saroj|@saroj]] was approved.
=== 2024-01-29 ===
* 21:42 [[gitlab:rgupta|@rgupta]] was approved.
=== 2024-01-07 ===
* 09:48 [[gitlab:lutrome|@lutrome]] was approved.
=== 2024-01-05 ===
* 20:48 [[gitlab:jinoytommanjaly|@jinoytommanjaly]] was approved.
* 02:51 [[gitlab:braunobruno|@braunobruno]] was approved.
* 01:08 [[gitlab:amorymeltzer|@amorymeltzer]] was approved.
* 01:08 [[gitlab:phi22ipus|@phi22ipus]] was approved.
=== 2024-01-03 ===
* 14:45 [[gitlab:gabina|@gabina]] was approved.
=== 2024-01-02 ===
* 13:18 [[gitlab:arthurtaylor|@arthurtaylor]] was approved.
=== 2023-12-23 ===
* 00:33 [[gitlab:aram|@aram]] was approved.
=== 2023-12-22 ===
* 16:24 [[gitlab:elpitareio|@elpitareio]] was approved.
=== 2023-12-21 ===
* 00:43 [[gitlab:bsadowski1|@bsadowski1]] was approved.
* 00:43 [[gitlab:ederporto|@ederporto]] was approved.
* 00:43 [[gitlab:sadraiiali|@sadraiiali]] was approved.
* 00:43 [[gitlab:wasp-outis|@wasp-outis]] was approved.
* 00:43 [[gitlab:bodhisattwa|@bodhisattwa]] was approved.
* 00:43 [[gitlab:air7538|@air7538]] was approved.
* 00:43 [[gitlab:anzx|@anzx]] was approved.
* 00:43 [[gitlab:tekask1903|@tekask1903]] was approved.
* 00:42 [[gitlab:kiwi-0x010c|@kiwi-0x010c]] was approved.
* 00:42 [[gitlab:mpaa|@mpaa]] was approved.
* 00:42 [[gitlab:kutay|@kutay]] was approved.
* 00:42 [[gitlab:wattmto|@wattmto]] was approved.
2396616
2396606
2026-03-29T11:36:24Z
Gitlabaccountapprovalbot
37332
@giftcup was approved.
2396616
wikitext
text/x-wiki
<noinclude>'''Audit log of approvals and rejections''' made by [[gitlab:gitlabaccountapprovalbot|@gitlabaccountapprovalbot]]. __NOTOC__</noinclude>
=== 2026-03-29 ===
* 11:36 [[gitlab:giftcup|@giftcup]] was approved.
=== 2026-03-28 ===
* 14:51 [[gitlab:janeeva1|@janeeva1]] was approved.
=== 2026-03-26 ===
* 13:36 [[gitlab:saiphani02|@saiphani02]] was approved.
* 11:48 [[gitlab:valerioboz-wmch|@valerioboz-wmch]] was approved.
=== 2026-03-25 ===
* 09:45 "quansi" was rejected (pending since 2025-12-24T09:42:13.451Z).
* 02:18 [[gitlab:viztor|@viztor]] was approved.
=== 2026-03-24 ===
* 23:18 [[gitlab:maryyann|@maryyann]] was approved.
* 23:01 [[gitlab:codenamenoreste|@codenamenoreste]] was approved.
* 13:36 [[gitlab:marc-maillard-wmse|@marc-maillard-wmse]] was approved.
* 07:39 "fred2675" was rejected (pending since 2025-12-23T07:39:11.380Z).
=== 2026-03-23 ===
* 14:51 [[gitlab:komla|@komla]] was approved.
* 05:51 "lunachuck43" was rejected (pending since 2025-12-22T05:50:17.862Z).
* 04:06 "reza110011" was rejected (pending since 2025-12-22T04:05:25.117Z).
=== 2026-03-20 ===
* 21:54 "mertgor" was rejected (pending since 2025-12-19T21:51:51.419Z).
* 20:57 "autanmahmah" was rejected (pending since 2025-12-19T20:54:51.678Z).
* 09:57 [[gitlab:nethahussain|@nethahussain]] was approved.
* 09:27 [[gitlab:piewriter|@piewriter]] was approved.
* 08:15 [[gitlab:dondersmooi|@dondersmooi]] was approved.
=== 2026-03-19 ===
* 21:03 "sayvhior" was rejected (pending since 2025-12-18T21:02:31.699Z).
=== 2026-03-18 ===
* 20:15 [[gitlab:martinmystere|@martinmystere]] was approved.
=== 2026-03-17 ===
* 02:51 "louperivois" was rejected (pending since 2025-12-16T02:50:48.197Z).
=== 2026-03-16 ===
* 12:54 "mokayaj857" was rejected (pending since 2025-12-15T12:53:39.015Z).
* 06:18 "roamer15" was rejected (pending since 2025-12-15T06:16:38.042Z).
=== 2026-03-14 ===
* 11:12 "umaramuhammad" was rejected (pending since 2025-12-13T11:10:44.004Z).
* 09:33 "akuma19" was rejected (pending since 2025-12-13T09:31:39.044Z).
* 07:06 [[gitlab:syunsyunminmin|@syunsyunminmin]] was approved.
=== 2026-03-12 ===
* 20:24 [[gitlab:11wb|@11wb]] was approved.
* 09:54 [[gitlab:bcxfu75k|@bcxfu75k]] was approved.
=== 2026-03-10 ===
* 09:12 [[gitlab:viktoriahillerudwmse|@viktoriahillerudwmse]] was approved.
=== 2026-03-06 ===
* 08:09 "vazhayilnewone" was rejected (pending since 2025-12-05T08:07:02.184Z).
=== 2026-03-04 ===
* 20:54 [[gitlab:elphie|@elphie]] was approved.
* 11:39 "ronaldahmed" was rejected (pending since 2025-12-03T11:37:47.492Z).
* 02:12 "ltslw" was rejected (pending since 2025-12-03T02:11:52.040Z).
=== 2026-03-02 ===
* 19:21 "dlopez350" was rejected (pending since 2025-12-01T19:20:38.918Z).
* 18:15 [[gitlab:lsandergreen|@lsandergreen]] was approved.
=== 2026-03-01 ===
* 10:51 [[gitlab:clintacc|@clintacc]] was approved.
=== 2026-02-28 ===
* 09:24 "cardboardlamp" was rejected (pending since 2025-11-29T09:22:03.947Z).
* 08:18 "wiki-pavan" was rejected (pending since 2025-11-29T08:16:24.184Z).
=== 2026-02-27 ===
* 20:45 "thisisrick25" was rejected (pending since 2025-11-28T20:42:24.454Z).
=== 2026-02-26 ===
* 13:57 "chuiimuiiofc" was rejected (pending since 2025-11-27T13:57:02.794Z).
* 13:54 "steffpro" was rejected (pending since 2025-11-27T13:52:10.859Z).
=== 2026-02-25 ===
* 21:24 "abubakarhabibudayyabu" was rejected (pending since 2025-11-26T21:22:37.776Z).
=== 2026-02-24 ===
* 05:00 "playboi" was rejected (pending since 2025-11-25T05:00:30.762Z).
=== 2026-02-23 ===
* 14:00 "alph65" was rejected (pending since 2025-11-24T13:59:00.797Z).
* 12:33 [[gitlab:robertsky|@robertsky]] was approved.
=== 2026-02-22 ===
* 00:30 "hp8p" was rejected (pending since 2025-11-23T00:29:24.741Z).
=== 2026-02-19 ===
* 16:45 "clayjar" was rejected (pending since 2025-11-20T16:44:48.380Z).
=== 2026-02-18 ===
* 22:18 "nexus" was rejected (pending since 2025-11-19T22:16:48.818Z).
* 12:00 "bernsteinnn" was rejected (pending since 2025-11-19T11:59:04.427Z).
=== 2026-02-17 ===
* 11:36 "jason2000-cpu" was rejected (pending since 2025-11-18T11:34:00.314Z).
=== 2026-02-16 ===
* 14:54 "smaurya" was rejected (pending since 2025-11-17T14:52:06.906Z).
=== 2026-02-15 ===
* 16:51 "kra-79" was rejected (pending since 2025-11-16T16:50:41.375Z).
=== 2026-02-14 ===
* 15:15 [[gitlab:mess|@mess]] was approved.
=== 2026-02-13 ===
* 13:57 "sopalsuemae957" was rejected (pending since 2025-11-14T13:55:16.921Z).
* 13:30 [[gitlab:wyslijp16-toolforge|@wyslijp16-toolforge]] was approved.
=== 2026-02-12 ===
* 16:30 "kristinagligoric" was rejected (pending since 2025-11-13T16:29:21.646Z).
* 03:33 [[gitlab:anyehansen|@anyehansen]] was approved.
* 02:21 [[gitlab:thejoyfultentmaker|@thejoyfultentmaker]] was approved.
=== 2026-02-10 ===
* 13:18 [[gitlab:db111|@db111]] was approved.
=== 2026-02-09 ===
* 19:06 "squirrel289" was rejected (pending since 2025-11-10T19:04:27.831Z).
=== 2026-02-06 ===
* 20:54 [[gitlab:gillux|@gillux]] was approved.
* 09:09 [[gitlab:lih|@lih]] was approved.
=== 2026-01-31 ===
* 16:21 [[gitlab:taxonbot1|@taxonbot1]] was approved.
=== 2026-01-28 ===
* 14:30 [[gitlab:ademola|@ademola]] was approved.
* 10:51 "watshell" was rejected (pending since 2025-10-29T10:51:01.521Z).
=== 2026-01-26 ===
* 23:06 "tavaresgmg" was rejected (pending since 2025-10-27T23:04:42.140Z).
=== 2026-01-25 ===
* 06:03 "cata" was rejected (pending since 2025-10-26T06:01:26.155Z).
=== 2026-01-24 ===
* 21:15 [[gitlab:wiegels|@wiegels]] was approved.
* 06:30 [[gitlab:blaquans|@blaquans]] was approved.
=== 2026-01-23 ===
* 16:27 [[gitlab:lerickson|@lerickson]] was approved.
* 10:15 "fran0035g" was rejected (pending since 2025-10-24T10:12:17.732Z).
=== 2026-01-22 ===
* 21:00 "hacksyn" was rejected (pending since 2025-10-23T20:59:15.982Z).
=== 2026-01-21 ===
* 17:30 [[gitlab:otcenas11|@otcenas11]] was approved.
=== 2026-01-19 ===
* 21:48 [[gitlab:amdrel|@amdrel]] was approved.
* 04:36 "rayalexa" was rejected (pending since 2025-10-20T04:35:02.094Z).
=== 2026-01-18 ===
* 15:45 "somya" was rejected (pending since 2025-10-19T15:43:43.701Z).
* 06:54 "sergg001" was rejected (pending since 2025-10-19T06:54:12.296Z).
=== 2026-01-16 ===
* 11:57 "zeejohsy" was rejected (pending since 2025-10-17T11:56:22.372Z).
* 04:45 "rocky25" was rejected (pending since 2025-10-17T04:43:33.180Z).
=== 2026-01-15 ===
* 16:39 "tiisu" was rejected (pending since 2025-10-16T16:37:18.438Z).
* 12:00 "noahalorwu" was rejected (pending since 2025-10-16T11:58:26.133Z).
* 10:39 "prjayaiuedu" was rejected (pending since 2025-10-16T10:37:16.947Z).
=== 2026-01-13 ===
* 17:21 [[gitlab:lwilson-ctr|@lwilson-ctr]] was approved.
=== 2026-01-12 ===
* 17:03 "stagietechs" was rejected (pending since 2025-10-13T17:02:25.281Z).
=== 2026-01-10 ===
* 19:06 "keerthisr" was rejected (pending since 2025-10-11T19:05:01.758Z).
=== 2026-01-09 ===
* 20:36 "lightb" was rejected (pending since 2025-10-10T20:34:20.264Z).
=== 2026-01-08 ===
* 19:42 [[gitlab:tbodt|@tbodt]] was approved.
* 13:57 [[gitlab:martynranyard|@martynranyard]] was approved.
=== 2026-01-07 ===
* 17:48 [[gitlab:santanuwiki25|@santanuwiki25]] was approved.
* 14:27 "dipanshu" was rejected (pending since 2025-10-08T14:26:10.794Z).
* 12:30 "adeolaadesina" was rejected (pending since 2025-10-08T12:29:49.592Z).
* 09:21 "tony-kamande" was rejected (pending since 2025-10-08T09:20:28.421Z).
* 06:18 "hninwuttyi" was rejected (pending since 2025-10-08T06:17:28.006Z).
* 05:09 "andume" was rejected (pending since 2025-10-08T05:07:18.582Z).
* 02:00 "mosope" was rejected (pending since 2025-10-08T01:59:54.800Z).
* 01:15 [[gitlab:tungstalite|@tungstalite]] was approved.
=== 2026-01-06 ===
* 18:24 "leerensucher" was rejected (pending since 2025-10-07T18:21:41.253Z).
* 14:54 "leonidlednev" was rejected (pending since 2025-10-07T14:53:07.273Z).
* 12:57 "alexandre-tingaud" was rejected (pending since 2025-10-07T12:54:27.206Z).
=== 2026-01-04 ===
* 21:33 [[gitlab:matr1x-101|@matr1x-101]] was approved.
* 15:18 "makjr" was rejected (pending since 2025-10-05T15:16:31.558Z).
* 14:09 "dakshq" was rejected (pending since 2025-10-05T14:08:40.608Z).
=== 2026-01-03 ===
* 20:42 [[gitlab:apehitkey|@apehitkey]] was approved.
* 18:00 [[gitlab:jeremyb|@jeremyb]] was approved.
* 14:09 [[gitlab:twelephant|@twelephant]] was approved.
=== 2026-01-01 ===
* 11:30 "shellstanislav" was rejected (pending since 2025-10-02T11:29:10.150Z).
=== 2025-12-30 ===
* 19:51 "camilojdiaz" was rejected (pending since 2025-09-30T19:49:24.913Z).
=== 2025-12-29 ===
* 16:03 "zied" was rejected (pending since 2025-09-29T16:01:30.415Z).
* 08:18 "rahulsidpradhan" was rejected (pending since 2025-09-29T08:17:02.849Z).
=== 2025-12-26 ===
* 09:48 "thembo42" was rejected (pending since 2025-09-26T09:45:15.033Z).
=== 2025-12-25 ===
* 14:03 "196936074751" was rejected (pending since 2025-09-25T14:02:31.367Z).
=== 2025-12-23 ===
* 16:21 "ngarnsworthy" was rejected (pending since 2025-09-23T16:20:41.211Z).
=== 2025-12-22 ===
* 12:39 "aza555" was rejected (pending since 2025-09-22T12:38:02.622Z).
=== 2025-12-20 ===
* 23:45 "saph" was rejected (pending since 2025-09-20T23:45:01.222Z).
=== 2025-12-19 ===
* 10:15 "vladdymoses" was rejected (pending since 2025-09-19T10:15:00.999Z).
* 07:15 "dirtylittlepoobah" was rejected (pending since 2025-09-19T07:13:55.537Z).
=== 2025-12-18 ===
* 16:24 [[gitlab:guyfawcus|@guyfawcus]] was approved.
=== 2025-12-17 ===
* 21:39 [[gitlab:holdyourhorses|@holdyourhorses]] was approved.
* 18:30 "prudencia" was rejected (pending since 2025-09-17T18:27:18.860Z).
* 02:24 "lottie" was rejected (pending since 2025-09-17T02:21:21.744Z).
=== 2025-12-16 ===
* 09:39 [[gitlab:melcatherine|@melcatherine]] was approved.
* 08:54 [[gitlab:leila237|@leila237]] was approved.
=== 2025-12-15 ===
* 18:27 [[gitlab:royalsailor|@royalsailor]] was approved.
* 09:39 [[gitlab:olaf8940|@olaf8940]] was approved.
* 09:39 "brianbybyby" was rejected (pending since 2025-09-15T09:37:45.430Z).
=== 2025-12-14 ===
* 20:21 [[gitlab:essa237|@essa237]] was approved.
* 16:42 [[gitlab:bovimacoco|@bovimacoco]] was approved.
=== 2025-12-13 ===
* 21:54 "mmns21" was rejected (pending since 2025-09-13T21:52:24.017Z).
* 20:33 "bugcrawler" was rejected (pending since 2025-09-13T20:31:09.211Z).
=== 2025-12-12 ===
* 14:39 "ruvchoudhary" was rejected (pending since 2025-09-12T14:36:16.167Z).
* 06:54 "rezadress" was rejected (pending since 2025-09-12T06:52:21.749Z).
=== 2025-12-10 ===
* 17:30 [[gitlab:itsmoon|@itsmoon]] was approved.
=== 2025-12-09 ===
* 15:42 [[gitlab:mercy-o|@mercy-o]] was approved.
=== 2025-12-06 ===
* 16:45 "jacquesradjabu" was rejected (pending since 2025-09-06T16:45:17.969Z).
* 11:27 [[gitlab:ikhitron|@ikhitron]] was approved.
=== 2025-12-01 ===
* 08:12 "halconmilenario21" was rejected (pending since 2025-09-01T08:12:10.262Z).
=== 2025-11-30 ===
* 21:06 [[gitlab:habs|@habs]] was approved.
=== 2025-11-29 ===
* 16:36 "bovimacoco" was rejected (pending since 2025-08-30T16:34:39.712Z).
* 00:45 [[gitlab:jjpmaster|@jjpmaster]] was approved.
=== 2025-11-24 ===
* 10:30 "alph65" was rejected (pending since 2025-08-25T10:28:40.957Z).
* 02:24 [[gitlab:yaron|@yaron]] was approved.
=== 2025-11-20 ===
* 16:06 "clayjar" was rejected (pending since 2025-08-21T16:04:54.450Z).
=== 2025-11-17 ===
* 21:09 [[gitlab:ankita97531|@ankita97531]] was approved.
=== 2025-11-16 ===
* 14:15 "commanderkefir" was rejected (pending since 2025-08-17T14:13:14.791Z).
* 08:21 "rehankhan78" was rejected (pending since 2025-08-17T08:19:44.896Z).
=== 2025-11-15 ===
* 14:36 "cyberscribe" was rejected (pending since 2025-08-16T14:34:27.230Z).
=== 2025-11-13 ===
* 04:21 "waddie96" was rejected (pending since 2025-08-14T04:19:27.461Z).
=== 2025-11-11 ===
* 06:42 [[gitlab:seanhoyland|@seanhoyland]] was approved.
=== 2025-11-10 ===
* 00:06 [[gitlab:jaredblumer|@jaredblumer]] was approved.
=== 2025-11-09 ===
* 22:36 "heinxiety" was rejected (pending since 2025-08-10T22:33:12.041Z).
=== 2025-11-07 ===
* 22:00 [[gitlab:forzagreen|@forzagreen]] was approved.
=== 2025-11-06 ===
* 16:57 [[gitlab:rsilvola|@rsilvola]] was approved.
=== 2025-11-04 ===
* 21:24 [[gitlab:devdoingdev|@devdoingdev]] was approved.
=== 2025-11-03 ===
* 17:48 "joewaleed98" was rejected (pending since 2025-08-04T17:46:12.191Z).
=== 2025-11-01 ===
* 18:00 "eliasempresas" was rejected (pending since 2025-08-02T17:58:04.412Z).
=== 2025-10-31 ===
* 18:51 [[gitlab:chaoticenby|@chaoticenby]] was approved.
* 04:33 "3ch310n" was rejected (pending since 2025-08-01T04:32:21.982Z).
=== 2025-10-30 ===
* 10:03 [[gitlab:tausheefhassan|@tausheefhassan]] was approved.
=== 2025-10-29 ===
* 14:54 "theap" was rejected (pending since 2025-07-30T14:52:12.066Z).
=== 2025-10-28 ===
* 06:06 [[gitlab:tanbiruzzaman|@tanbiruzzaman]] was approved.
=== 2025-10-27 ===
* 07:51 [[gitlab:jmoore111|@jmoore111]] was approved.
=== 2025-10-25 ===
* 21:09 [[gitlab:valor|@valor]] was approved.
* 21:03 [[gitlab:booksmurf|@booksmurf]] was approved.
* 02:48 "mystyc1" was rejected (pending since 2025-07-26T02:46:19.373Z).
=== 2025-10-24 ===
* 05:12 "aadarshmahesh" was rejected (pending since 2025-07-25T05:09:38.264Z).
=== 2025-10-22 ===
* 20:54 [[gitlab:janewanga|@janewanga]] was approved.
* 17:27 "abeljeevan" was rejected (pending since 2025-07-23T17:26:46.884Z).
* 16:12 "shrimpnaur" was rejected (pending since 2025-07-23T16:10:37.864Z).
=== 2025-10-21 ===
* 18:51 "jrmuizel" was rejected (pending since 2025-07-22T18:50:07.315Z).
* 09:33 [[gitlab:dpogorzelski|@dpogorzelski]] was approved.
=== 2025-10-17 ===
* 13:21 [[gitlab:blegodwin|@blegodwin]] was approved.
=== 2025-10-16 ===
* 14:51 [[gitlab:bahago|@bahago]] was approved.
* 14:12 "harikrishna0005" was rejected (pending since 2025-07-17T14:10:48.385Z).
* 14:09 "gauthammohanraj" was rejected (pending since 2025-07-17T14:08:47.643Z).
=== 2025-10-15 ===
* 13:48 [[gitlab:adwivedii|@adwivedii]] was approved.
* 13:18 [[gitlab:kimbrenekakande|@kimbrenekakande]] was approved.
* 13:03 "childmnajennifer" was rejected (pending since 2025-07-16T13:01:50.236Z).
* 05:06 "vssb4214" was rejected (pending since 2025-07-16T05:05:33.985Z).
=== 2025-10-14 ===
* 19:39 [[gitlab:afanyulionel|@afanyulionel]] was approved.
* 15:33 [[gitlab:sadrettin|@sadrettin]] was approved.
* 14:18 [[gitlab:tmwyk|@tmwyk]] was approved.
* 08:42 "yasu0796" was rejected (pending since 2025-07-15T08:41:26.453Z).
=== 2025-10-13 ===
* 16:09 [[gitlab:atlas0007|@atlas0007]] was approved.
=== 2025-10-11 ===
* 17:42 [[gitlab:techwizzie|@techwizzie]] was approved.
=== 2025-10-10 ===
* 19:03 [[gitlab:miiswom|@miiswom]] was approved.
* 16:06 [[gitlab:ninatakang|@ninatakang]] was approved.
=== 2025-10-09 ===
* 15:42 [[gitlab:jaykaneki|@jaykaneki]] was approved.
* 14:21 [[gitlab:lebogang|@lebogang]] was approved.
* 14:15 [[gitlab:kimondorose|@kimondorose]] was approved.
* 13:48 [[gitlab:joyakinyi|@joyakinyi]] was approved.
* 13:48 [[gitlab:dikshyashahi|@dikshyashahi]] was approved.
* 13:45 [[gitlab:obediobadiah|@obediobadiah]] was approved.
* 13:45 [[gitlab:system625|@system625]] was approved.
* 13:45 [[gitlab:rolalove|@rolalove]] was approved.
* 13:39 [[gitlab:olatundeawo|@olatundeawo]] was approved.
* 13:36 [[gitlab:danielchristlight|@danielchristlight]] was approved.
* 13:36 [[gitlab:dipanshu1223|@dipanshu1223]] was approved.
* 13:36 [[gitlab:aradhya|@aradhya]] was approved.
* 09:57 "bognd" was rejected (pending since 2025-07-10T09:55:48.661Z).
=== 2025-10-08 ===
* 23:36 [[gitlab:sopzy|@sopzy]] was approved.
* 23:03 [[gitlab:oluwatumininu|@oluwatumininu]] was approved.
* 19:39 [[gitlab:levon003|@levon003]] was approved.
* 15:24 [[gitlab:ritika-bhambri11|@ritika-bhambri11]] was approved.
* 13:45 [[gitlab:anbanguyen|@anbanguyen]] was approved.
* 13:36 [[gitlab:chumzine|@chumzine]] was approved.
* 13:27 [[gitlab:shr0x-ya|@shr0x-ya]] was approved.
* 12:45 [[gitlab:nurahwakili|@nurahwakili]] was approved.
* 03:42 "nazhiba" was rejected (pending since 2025-07-09T03:40:12.625Z).
* 02:12 "mafennel" was rejected (pending since 2025-07-09T02:11:40.598Z).
=== 2025-10-07 ===
* 22:54 [[gitlab:olusegunfaj|@olusegunfaj]] was approved.
* 21:30 [[gitlab:rona|@rona]] was approved.
* 21:09 [[gitlab:sandijigs|@sandijigs]] was approved.
* 13:36 "xisbajao" was rejected (pending since 2025-07-08T13:33:35.018Z).
* 01:36 "areczek94" was rejected (pending since 2025-07-08T01:35:40.633Z).
=== 2025-10-06 ===
* 19:21 "wmcarter2017" was rejected (pending since 2025-07-07T19:21:12.899Z).
=== 2025-10-05 ===
* 14:15 "meetmendapara" was rejected (pending since 2025-07-06T14:14:16.726Z).
=== 2025-10-04 ===
* 20:51 "nftbaee" was rejected (pending since 2025-07-05T20:50:57.688Z).
=== 2025-10-03 ===
* 06:12 [[gitlab:javiermonton|@javiermonton]] was approved.
=== 2025-10-02 ===
* 20:15 "talaqalotaibipmp" was rejected (pending since 2025-07-03T20:13:05.164Z).
=== 2025-10-01 ===
* 10:54 "bjensen" was rejected (pending since 2025-07-02T10:53:46.574Z).
* 02:45 "kowal1984" was rejected (pending since 2025-07-02T02:44:56.946Z).
=== 2025-09-30 ===
* 21:21 [[gitlab:kavaljeetsingh|@kavaljeetsingh]] was approved.
* 00:24 "adium" was rejected (pending since 2025-07-01T00:23:43.807Z).
=== 2025-09-28 ===
* 08:54 [[gitlab:pexerik|@pexerik]] was approved.
=== 2025-09-27 ===
* 13:57 [[gitlab:rubahhitamvukova|@rubahhitamvukova]] was approved.
=== 2025-09-26 ===
* 16:57 "algorithmic" was rejected (pending since 2025-06-27T16:56:17.480Z).
* 13:54 [[gitlab:shadabgdg|@shadabgdg]] was approved.
* 13:12 [[gitlab:spushpit|@spushpit]] was approved.
=== 2025-09-20 ===
* 14:06 "bwiki" was rejected (pending since 2025-06-21T13:59:14.749Z).
=== 2025-09-16 ===
* 05:39 [[gitlab:deepchirp|@deepchirp]] was approved.
=== 2025-09-15 ===
* 22:00 [[gitlab:noisk8|@noisk8]] was approved.
* 11:03 "ahonc" was rejected (pending since 2025-06-16T11:00:54.843Z).
=== 2025-09-13 ===
* 18:24 "a-ssh22" was rejected (pending since 2025-06-14T18:23:33.937Z).
* 12:36 [[gitlab:rajashreetalukdar|@rajashreetalukdar]] was approved.
* 00:45 [[gitlab:sumitsurai|@sumitsurai]] was approved.
=== 2025-09-12 ===
* 17:12 [[gitlab:suyash23|@suyash23]] was approved.
* 00:46 "remotetravel" was rejected (pending since 2025-06-13T00:44:08.171Z).
=== 2025-09-10 ===
* 21:09 "jancborchardt" was rejected (pending since 2025-06-11T21:06:30.759Z).
=== 2025-09-09 ===
* 17:03 [[gitlab:vwf|@vwf]] was approved.
* 06:36 [[gitlab:cactusisme|@cactusisme]] was approved.
=== 2025-09-08 ===
* 18:09 "birushandegeya" was rejected (pending since 2025-06-09T18:08:00.087Z).
* 16:27 "ngarnsworthy" was rejected (pending since 2025-06-09T16:24:37.213Z).
* 12:33 "zolgoyo" was rejected (pending since 2025-06-09T12:31:34.199Z).
=== 2025-09-06 ===
* 23:09 [[gitlab:jaishsingh913|@jaishsingh913]] was approved.
=== 2025-09-05 ===
* 21:45 [[gitlab:sakshi2|@sakshi2]] was approved.
* 20:42 "abdukhaliq1" was rejected (pending since 2025-06-06T20:40:42.023Z).
* 14:27 "beubsamy" was rejected (pending since 2025-06-06T14:27:06.781Z).
=== 2025-09-04 ===
* 23:27 "sdhehua" was rejected (pending since 2025-06-05T23:24:45.777Z).
* 19:00 [[gitlab:perry|@perry]] was approved.
* 11:24 "saintwolf" was rejected (pending since 2025-06-05T11:21:20.176Z).
=== 2025-09-02 ===
* 05:48 [[gitlab:aliu|@aliu]] was approved.
=== 2025-08-29 ===
* 13:30 "kksurendran066" was rejected (pending since 2025-05-30T13:27:48.755Z).
=== 2025-08-28 ===
* 22:18 "tauraamuix" was rejected (pending since 2025-05-29T22:16:08.228Z).
=== 2025-08-26 ===
* 19:03 [[gitlab:dikkulah|@dikkulah]] was approved.
=== 2025-08-22 ===
* 23:51 [[gitlab:khoroshun_mike|@khoroshun_mike]] was approved.
=== 2025-08-21 ===
* 07:39 [[gitlab:yuka|@yuka]] was approved.
=== 2025-08-19 ===
* 07:48 [[gitlab:zhaofjx|@zhaofjx]] was approved.
=== 2025-08-17 ===
* 14:27 "madhan13k" was rejected (pending since 2025-05-18T14:26:08.973Z).
=== 2025-08-15 ===
* 10:15 "mohammed_abukhadra" was rejected (pending since 2025-05-16T10:14:48.403Z).
=== 2025-08-11 ===
* 11:48 "hmmyesbro" was rejected (pending since 2025-05-12T11:45:24.350Z).
=== 2025-08-10 ===
* 13:15 [[gitlab:dactyl|@dactyl]] was approved.
=== 2025-08-09 ===
* 04:39 "xxxx100000" was rejected (pending since 2025-05-10T04:37:44.949Z).
=== 2025-08-08 ===
* 14:33 [[gitlab:josefanthony|@josefanthony]] was approved.
=== 2025-08-07 ===
* 23:42 [[gitlab:robins7|@robins7]] was approved.
* 21:42 [[gitlab:pols12|@pols12]] was approved.
* 17:15 "sbronson" was rejected (pending since 2025-05-08T17:15:08.834Z).
* 14:57 [[gitlab:alvindulle|@alvindulle]] was approved.
* 14:45 [[gitlab:xentos|@xentos]] was approved.
* 06:27 "jamesboste" was rejected (pending since 2025-05-08T06:25:14.793Z).
* 03:57 "ysun" was rejected (pending since 2025-05-08T03:55:07.348Z).
=== 2025-08-06 ===
* 21:51 "pols12" was rejected (pending since 2025-05-07T21:49:13.598Z).
* 01:51 "okeamah" was rejected (pending since 2025-05-07T01:48:50.114Z).
=== 2025-08-05 ===
* 09:15 "mobashir-2013" was rejected (pending since 2025-05-06T09:14:24.069Z).
=== 2025-08-01 ===
* 08:00 "douginamug" was rejected (pending since 2025-05-02T07:57:38.317Z).
=== 2025-07-31 ===
* 02:30 [[gitlab:ads|@ads]] was approved.
=== 2025-07-27 ===
* 13:15 "mrico2703" was rejected (pending since 2025-04-27T13:13:12.346Z).
* 10:17 [[gitlab:josephfrancis12|@josephfrancis12]] was approved.
* 10:17 [[gitlab:fuzzew|@fuzzew]] was approved.
* 05:57 [[gitlab:biscuitbobby|@biscuitbobby]] was approved.
* 05:48 [[gitlab:ecoholic|@ecoholic]] was approved.
=== 2025-07-26 ===
* 11:48 [[gitlab:chimnayyyy|@chimnayyyy]] was approved.
* 11:48 [[gitlab:alwinalbert|@alwinalbert]] was approved.
* 11:48 [[gitlab:hridyakk|@hridyakk]] was approved.
* 11:45 [[gitlab:gaurigupta21|@gaurigupta21]] was approved.
* 11:45 [[gitlab:binetaa|@binetaa]] was approved.
* 10:21 [[gitlab:jyothikat22|@jyothikat22]] was approved.
* 10:21 [[gitlab:zobotrombie|@zobotrombie]] was approved.
* 10:21 [[gitlab:flykrth|@flykrth]] was approved.
* 10:21 [[gitlab:mehrinshamim|@mehrinshamim]] was approved.
* 10:21 [[gitlab:aadhi13|@aadhi13]] was approved.
* 10:21 [[gitlab:malavikam05|@malavikam05]] was approved.
* 10:18 [[gitlab:nf609|@nf609]] was approved.
* 05:48 [[gitlab:nazalnihad|@nazalnihad]] was approved.
* 05:48 [[gitlab:naveen28204280|@naveen28204280]] was approved.
=== 2025-07-25 ===
* 09:49 [[gitlab:kasyap9|@kasyap9]] was approved.
* 09:30 [[gitlab:swayamagrahari|@swayamagrahari]] was approved.
=== 2025-07-24 ===
* 19:36 [[gitlab:madutgn|@madutgn]] was approved.
=== 2025-07-23 ===
* 20:09 [[gitlab:somerandomdeveloper|@somerandomdeveloper]] was approved.
=== 2025-07-22 ===
* 00:15 [[gitlab:iagoqnsi|@iagoqnsi]] was approved.
=== 2025-07-21 ===
* 17:30 [[gitlab:asadiqui|@asadiqui]] was approved.
* 16:39 [[gitlab:tryvix1509|@tryvix1509]] was approved.
* 04:27 [[gitlab:damian|@damian]] was approved.
=== 2025-07-20 ===
* 09:42 "mike-khoroshun" was rejected (pending since 2025-04-20T09:42:22.732Z).
=== 2025-07-17 ===
* 17:57 [[gitlab:haroldkrabs|@haroldkrabs]] was approved.
* 13:45 [[gitlab:envlh|@envlh]] was approved.
=== 2025-07-14 ===
* 10:24 [[gitlab:missguru|@missguru]] was approved.
* 00:57 "clarfonthey" was rejected (pending since 2025-04-14T00:56:32.626Z).
=== 2025-07-13 ===
* 01:01 [[gitlab:l235|@l235]] was approved.
=== 2025-07-11 ===
* 03:06 "rodavlas" was rejected (pending since 2025-04-11T03:05:45.590Z).
=== 2025-07-06 ===
* 00:09 "lakasa" was rejected (pending since 2025-04-06T00:06:28.469Z).
=== 2025-07-05 ===
* 21:54 "ctrlzvi" was rejected (pending since 2025-04-05T21:54:12.542Z).
* 14:30 "aminualiyu" was rejected (pending since 2025-04-05T14:27:22.617Z).
=== 2025-07-04 ===
* 03:15 [[gitlab:galstar|@galstar]] was approved.
=== 2025-07-02 ===
* 11:27 "vicolas11" was rejected (pending since 2025-04-02T11:25:12.682Z).
=== 2025-06-29 ===
* 23:12 "naomi723" was rejected (pending since 2025-03-30T23:09:24.630Z).
=== 2025-06-28 ===
* 16:21 "mudeh2372" was rejected (pending since 2025-03-29T16:18:27.057Z).
=== 2025-06-27 ===
* 23:18 "rony143" was rejected (pending since 2025-03-28T23:16:13.671Z).
* 22:21 [[gitlab:rluts|@rluts]] was approved.
=== 2025-06-26 ===
* 13:54 "creativegurus" was rejected (pending since 2025-03-27T13:52:41.706Z).
=== 2025-06-24 ===
* 17:42 [[gitlab:devjadiya|@devjadiya]] was approved.
* 14:00 "dominic-r" was rejected (pending since 2025-03-25T14:00:07.307Z).
=== 2025-06-21 ===
* 00:48 [[gitlab:vriaa|@vriaa]] was approved.
=== 2025-06-18 ===
* 15:21 "ayushkhati1" was rejected (pending since 2025-03-19T15:18:50.062Z).
=== 2025-06-17 ===
* 20:45 "chiomavero" was rejected (pending since 2025-03-18T20:44:13.967Z).
* 00:27 [[gitlab:eggroll97|@eggroll97]] was approved.
=== 2025-06-14 ===
* 20:57 "volvox" was rejected (pending since 2025-03-15T20:56:34.018Z).
=== 2025-06-13 ===
* 16:09 [[gitlab:supergrey|@supergrey]] was approved.
* 11:03 "chqaz" was rejected (pending since 2025-03-14T11:01:09.600Z).
* 10:24 [[gitlab:slong-wmf|@slong-wmf]] was approved.
* 10:15 "hearvox" was rejected (pending since 2025-03-14T10:13:13.112Z).
=== 2025-06-12 ===
* 15:18 "jlam" was rejected (pending since 2025-03-13T15:17:54.099Z).
=== 2025-06-09 ===
* 20:48 "dipanjansengupta" was rejected (pending since 2025-03-10T20:48:03.545Z).
* 19:27 [[gitlab:reggycelly|@reggycelly]] was approved.
* 14:51 "arendpieter" was rejected (pending since 2025-03-10T14:51:01.445Z).
* 13:21 [[gitlab:greenreaper|@greenreaper]] was approved.
* 09:33 [[gitlab:mmta|@mmta]] was approved.
* 08:03 "a-ssh22" was rejected (pending since 2025-03-10T08:03:08.111Z).
=== 2025-06-08 ===
* 21:06 "mm-episodenlistedlvaupdater" was rejected (pending since 2025-03-09T21:04:06.323Z).
=== 2025-06-06 ===
* 11:06 [[gitlab:olea|@olea]] was approved.
=== 2025-06-05 ===
* 20:33 [[gitlab:encodedwp|@encodedwp]] was approved.
* 15:00 [[gitlab:toluayo|@toluayo]] was approved.
* 13:51 [[gitlab:arnold_lup|@arnold_lup]] was approved.
* 11:54 "sdhehua" was rejected (pending since 2025-03-06T11:51:48.241Z).
=== 2025-06-03 ===
* 21:27 [[gitlab:wewakey|@wewakey]] was approved.
* 12:36 "hunsimon2" was rejected (pending since 2025-03-04T12:34:56.520Z).
* 11:54 "hunsimon" was rejected (pending since 2025-03-04T11:53:54.652Z).
=== 2025-06-02 ===
* 12:01 [[gitlab:jaimedes|@jaimedes]] was approved.
=== 2025-05-30 ===
* 18:00 "sathvik9105" was rejected (pending since 2025-02-28T17:59:42.867Z).
* 11:21 [[gitlab:tonythomas01|@tonythomas01]] was approved.
* 10:06 [[gitlab:gpsleo|@gpsleo]] was approved.
=== 2025-05-29 ===
* 22:12 [[gitlab:codynguyen1116|@codynguyen1116]] was approved.
=== 2025-05-28 ===
* 02:57 [[gitlab:saper|@saper]] was approved.
=== 2025-05-27 ===
* 21:06 [[gitlab:mohammed_qays|@mohammed_qays]] was approved.
* 15:33 "satanluimm" was rejected (pending since 2025-02-25T15:32:48.101Z).
=== 2025-05-26 ===
* 23:57 "seyedali220" was rejected (pending since 2025-02-24T23:56:17.621Z).
=== 2025-05-21 ===
* 11:12 [[gitlab:guilherme|@guilherme]] was approved.
=== 2025-05-19 ===
* 13:24 [[gitlab:emojiwiki|@emojiwiki]] was approved.
=== 2025-05-18 ===
* 00:00 "xidme" was rejected (pending since 2025-02-15T23:58:56.796Z).
=== 2025-05-17 ===
* 02:39 "kdh8219" was rejected (pending since 2025-02-15T02:36:32.237Z).
=== 2025-05-16 ===
* 15:09 [[gitlab:maxbinderwmf|@maxbinderwmf]] was approved.
=== 2025-05-15 ===
* 04:30 "inspectorzer0" was rejected (pending since 2025-02-13T04:27:33.179Z).
=== 2025-05-14 ===
* 17:42 [[gitlab:llugo|@llugo]] was approved.
=== 2025-05-13 ===
* 20:18 "mmta" was rejected (pending since 2025-02-11T20:17:23.407Z).
=== 2025-05-11 ===
* 20:51 "jad" was rejected (pending since 2025-02-09T20:49:07.333Z).
* 17:54 "nishchalsundan" was rejected (pending since 2025-02-09T17:52:25.761Z).
* 16:39 "mohammed_abukhadra" was rejected (pending since 2025-02-09T16:39:03.730Z).
=== 2025-05-09 ===
* 09:12 [[gitlab:sirchanmp|@sirchanmp]] was approved.
=== 2025-05-08 ===
* 08:18 [[gitlab:mengeditch|@mengeditch]] was approved.
=== 2025-05-07 ===
* 03:45 "xluffy" was rejected (pending since 2025-02-05T03:45:14.181Z).
=== 2025-05-06 ===
* 16:54 "punhaniabhishek" was rejected (pending since 2025-02-04T16:53:50.758Z).
* 09:36 [[gitlab:bmartinezcalvo|@bmartinezcalvo]] was approved.
=== 2025-05-02 ===
* 12:24 [[gitlab:tohaomg|@tohaomg]] was approved.
* 11:48 [[gitlab:mavrikant|@mavrikant]] was approved.
* 11:45 [[gitlab:daanvr|@daanvr]] was approved.
=== 2025-05-01 ===
* 09:09 "mjoerg" was rejected (pending since 2025-01-30T09:09:04.204Z).
=== 2025-04-30 ===
* 23:06 "sanskardubey" was rejected (pending since 2025-01-29T23:03:25.489Z).
=== 2025-04-29 ===
* 16:00 "geyslein" was rejected (pending since 2025-01-28T16:00:01.510Z).
=== 2025-04-26 ===
* 09:30 "anjali9027" was rejected (pending since 2025-01-25T09:28:07.064Z).
=== 2025-04-25 ===
* 18:00 "salahhazaa" was rejected (pending since 2025-01-24T17:58:30.030Z).
* 15:15 [[gitlab:yiming|@yiming]] was approved.
* 02:06 "mrchanmp" was rejected (pending since 2025-01-24T02:03:58.308Z).
=== 2025-04-23 ===
* 17:03 "rj2904" was rejected (pending since 2025-01-22T17:03:11.207Z).
* 14:21 "nischay33" was rejected (pending since 2025-01-22T14:19:21.081Z).
=== 2025-04-22 ===
* 19:27 "dj80" was rejected (pending since 2025-01-21T19:25:28.498Z).
* 14:30 [[gitlab:kaimamin|@kaimamin]] was approved.
* 09:57 "debo" was rejected (pending since 2025-01-21T09:54:47.955Z).
=== 2025-04-21 ===
* 12:24 "unshell" was rejected (pending since 2025-01-20T12:21:59.686Z).
=== 2025-04-18 ===
* 15:06 [[gitlab:spartanarbinger|@spartanarbinger]] was approved.
=== 2025-04-16 ===
* 03:09 "dewey" was rejected (pending since 2025-01-15T03:06:17.488Z).
=== 2025-04-15 ===
* 19:45 "emdadul" was rejected (pending since 2025-01-14T19:42:29.285Z).
=== 2025-04-14 ===
* 06:45 [[gitlab:bcampbell804|@bcampbell804]] was approved.
=== 2025-04-11 ===
* 06:27 [[gitlab:jvanderhoop|@jvanderhoop]] was approved.
=== 2025-04-10 ===
* 04:12 "bhai420" was rejected (pending since 2025-01-09T04:10:29.430Z).
=== 2025-04-09 ===
* 05:03 "austinvarshney" was rejected (pending since 2025-01-08T05:02:34.175Z).
=== 2025-04-06 ===
* 15:36 [[gitlab:elph|@elph]] was approved.
=== 2025-04-02 ===
* 10:33 [[gitlab:ozge|@ozge]] was approved.
=== 2025-03-31 ===
* 20:15 "demandkey" was rejected (pending since 2024-12-30T20:14:23.096Z).
* 15:18 [[gitlab:danyya|@danyya]] was approved.
=== 2025-03-28 ===
* 15:54 [[gitlab:rutsavi09|@rutsavi09]] was approved.
* 15:54 [[gitlab:ilanen1|@ilanen1]] was approved.
=== 2025-03-25 ===
* 19:27 [[gitlab:irfo|@irfo]] was approved.
* 11:54 [[gitlab:kmontalva-wmf|@kmontalva-wmf]] was approved.
* 04:33 [[gitlab:paul26|@paul26]] was approved.
* 04:18 "as1100k" was rejected (pending since 2024-12-24T04:18:06.813Z).
=== 2025-03-24 ===
* 11:33 "amzadkhankk" was rejected (pending since 2024-12-23T11:33:14.176Z).
=== 2025-03-23 ===
* 12:24 "wolfdo" was rejected (pending since 2024-12-22T12:23:35.056Z).
=== 2025-03-22 ===
* 09:45 [[gitlab:fjmustak|@fjmustak]] was approved.
=== 2025-03-20 ===
* 18:42 "sathishkokila" was rejected (pending since 2024-12-19T18:39:35.161Z).
* 17:03 [[gitlab:alien4444|@alien4444]] was approved.
* 15:27 [[gitlab:davidcoronel|@davidcoronel]] was approved.
=== 2025-03-19 ===
* 22:57 [[gitlab:r1f4t|@r1f4t]] was approved.
* 19:03 "daniel24ps" was rejected (pending since 2024-12-18T19:00:21.249Z).
* 14:18 [[gitlab:beepbooppenguin|@beepbooppenguin]] was approved.
=== 2025-03-18 ===
* 17:48 "rahulkundu1209" was rejected (pending since 2024-12-17T17:46:41.936Z).
* 08:15 "kirtisikka972" was rejected (pending since 2024-12-17T08:13:25.487Z).
=== 2025-03-15 ===
* 13:30 "tulspal_sidhu" was rejected (pending since 2024-12-14T13:29:10.606Z).
* 01:39 "peacedeadc" was rejected (pending since 2024-12-14T01:37:36.579Z).
=== 2025-03-14 ===
* 03:51 [[gitlab:chuckthebuck|@chuckthebuck]] was approved.
* 02:33 "yxngtrtxll" was rejected (pending since 2024-12-13T02:31:51.658Z).
=== 2025-03-13 ===
* 14:36 [[gitlab:iccander|@iccander]] was approved.
=== 2025-03-12 ===
* 23:21 "jokerchic36" was rejected (pending since 2024-12-11T23:21:00.670Z).
* 15:30 [[gitlab:naomi|@naomi]] was approved.
* 15:27 [[gitlab:cobi|@cobi]] was approved.
=== 2025-03-11 ===
* 12:42 "mohitvermaxx" was rejected (pending since 2024-12-10T12:40:56.967Z).
=== 2025-03-10 ===
* 16:51 [[gitlab:nanona15dobato|@nanona15dobato]] was approved.
=== 2025-03-09 ===
* 22:39 [[gitlab:jonkolbert|@jonkolbert]] was approved.
* 20:45 [[gitlab:urbanecmtest2|@urbanecmtest2]] was approved.
=== 2025-03-07 ===
* 16:54 [[gitlab:hswan|@hswan]] was approved.
* 14:42 [[gitlab:atitkov|@atitkov]] was approved.
* 00:42 [[gitlab:infrastruktur|@infrastruktur]] was approved.
=== 2025-03-06 ===
* 17:21 "johnmann" was rejected (pending since 2024-12-05T17:19:24.995Z).
=== 2025-03-05 ===
* 07:33 [[gitlab:monx9494|@monx9494]] was approved.
=== 2025-03-02 ===
* 21:21 "paul26" was rejected (pending since 2024-12-01T21:20:19.681Z).
=== 2025-03-01 ===
* 19:15 [[gitlab:izno|@izno]] was approved.
* 12:45 [[gitlab:nyerho|@nyerho]] was approved.
=== 2025-02-28 ===
* 18:27 [[gitlab:chuckonwumelu|@chuckonwumelu]] was approved.
* 13:09 "ashwinpraveengo" was rejected (pending since 2024-11-29T13:07:47.240Z).
* 00:18 "eduardoaugusto" was rejected (pending since 2024-11-29T00:17:43.372Z).
=== 2025-02-27 ===
* 20:39 "volkanurl" was rejected (pending since 2024-11-28T20:37:18.101Z).
=== 2025-02-24 ===
* 21:15 [[gitlab:feeglgeef|@feeglgeef]] was approved.
* 20:18 [[gitlab:piaanalysis2|@piaanalysis2]] was approved.
* 19:06 [[gitlab:dhardy|@dhardy]] was approved.
=== 2025-02-22 ===
* 19:27 [[gitlab:owuh|@owuh]] was approved.
=== 2025-02-19 ===
* 16:06 [[gitlab:artemkloko|@artemkloko]] was approved.
* 13:03 [[gitlab:jgafnea|@jgafnea]] was approved.
=== 2025-02-17 ===
* 16:33 [[gitlab:asmartkitten|@asmartkitten]] was approved.
=== 2025-02-16 ===
* 19:12 "gaurigupta21" was rejected (pending since 2024-11-17T19:11:07.416Z).
=== 2025-02-15 ===
* 01:18 [[gitlab:mediawiki-quickstart-ci|@mediawiki-quickstart-ci]] was approved.
=== 2025-02-14 ===
* 15:21 "nathanbnm" was rejected (pending since 2024-11-15T15:18:19.632Z).
=== 2025-02-13 ===
* 16:45 [[gitlab:priyanshuchahal|@priyanshuchahal]] was approved.
* 16:42 [[gitlab:ajhalili2006|@ajhalili2006]] was approved.
=== 2025-02-12 ===
* 23:21 "monkeypatch999" was rejected (pending since 2024-11-13T23:20:38.398Z).
* 06:36 [[gitlab:jainlakshita28|@jainlakshita28]] was approved.
=== 2025-02-11 ===
* 19:27 [[gitlab:matthewsm2|@matthewsm2]] was approved.
=== 2025-02-09 ===
* 16:15 "mohammed_abukhadra" was rejected (pending since 2024-11-10T16:15:18.361Z).
=== 2025-02-07 ===
* 21:33 "brennan" was rejected (pending since 2024-11-08T21:31:07.351Z).
=== 2025-02-06 ===
* 08:24 "mmta" was rejected (pending since 2024-11-07T08:22:36.724Z).
* 06:21 [[gitlab:bunnypranav|@bunnypranav]] was approved.
=== 2025-02-05 ===
* 22:39 "chrissteinchen" was rejected (pending since 2024-11-06T22:38:16.673Z).
=== 2025-02-03 ===
* 07:45 "edriiic" was rejected (pending since 2024-11-04T07:44:46.849Z).
* 01:12 "geppy" was rejected (pending since 2024-11-04T01:10:48.710Z).
=== 2025-02-02 ===
* 13:18 "funa-enpitu" was rejected (pending since 2024-11-03T13:15:46.065Z).
=== 2025-01-31 ===
* 23:42 "nfontes" was rejected (pending since 2024-11-01T23:39:41.755Z).
* 22:51 "sbronson" was rejected (pending since 2024-11-01T22:50:31.871Z).
* 00:42 [[gitlab:farid|@farid]] was approved.
=== 2025-01-27 ===
* 08:15 [[gitlab:eliza189|@eliza189]] was approved.
=== 2025-01-25 ===
* 09:51 [[gitlab:pamputt|@pamputt]] was approved.
=== 2025-01-23 ===
* 14:30 [[gitlab:lubianat|@lubianat]] was approved.
* 11:45 [[gitlab:bootsa|@bootsa]] was approved.
=== 2025-01-21 ===
* 05:09 "niko" was rejected (pending since 2024-07-21T16:10:01.377Z).
* 05:09 "thawizkid369777" was rejected (pending since 2024-07-18T17:42:44.493Z).
* 05:09 "sarthaksingh2" was rejected (pending since 2024-07-10T11:31:30.470Z).
* 05:09 "shriyakt" was rejected (pending since 2024-07-06T04:54:10.248Z).
* 05:09 "akshaya" was rejected (pending since 2024-07-06T04:04:51.488Z).
* 05:09 "alaka03aj" was rejected (pending since 2024-07-05T18:01:54.876Z).
* 05:09 "sulochanaviji-5049" was rejected (pending since 2024-07-01T05:58:00.427Z).
* 05:09 "nayanjnath" was rejected (pending since 2024-07-01T02:51:57.405Z).
* 05:09 "sd44" was rejected (pending since 2024-06-30T04:28:51.436Z).
* 05:09 "metavalent" was rejected (pending since 2024-06-29T01:37:14.210Z).
* 05:09 "wicloudx" was rejected (pending since 2024-06-28T11:51:23.335Z).
* 05:09 "debo" was rejected (pending since 2024-06-28T01:44:59.845Z).
* 05:09 "bwiki" was rejected (pending since 2024-06-23T14:15:38.032Z).
* 05:09 "toprak" was rejected (pending since 2024-06-23T11:35:50.819Z).
* 05:09 "iristeller" was rejected (pending since 2024-06-14T20:53:48.959Z).
* 05:09 "jcolvin" was rejected (pending since 2024-06-12T17:29:01.238Z).
* 05:09 "kalyan" was rejected (pending since 2024-06-07T07:52:46.993Z).
* 05:09 "bluecrystal" was rejected (pending since 2024-06-06T19:16:20.107Z).
* 05:09 "iftttrohit" was rejected (pending since 2024-06-04T12:08:50.818Z).
* 05:09 "pogpotato" was rejected (pending since 2024-06-03T17:58:21.684Z).
* 05:09 "cptlausebaer" was rejected (pending since 2024-05-31T18:53:27.692Z).
* 05:09 "hdevine825" was rejected (pending since 2024-05-31T17:04:18.279Z).
* 05:09 "anaghaa18" was rejected (pending since 2024-05-25T19:14:31.803Z).
* 05:09 "atharvanair04" was rejected (pending since 2024-05-25T14:24:52.825Z).
* 05:09 "anasvemmully" was rejected (pending since 2024-05-25T06:10:27.261Z).
* 05:09 "abhinavmohandas" was rejected (pending since 2024-05-25T06:05:24.825Z).
* 05:09 "kksurendran06" was rejected (pending since 2024-05-25T06:04:38.082Z).
* 05:09 "albertmarshall8896" was rejected (pending since 2024-05-23T09:32:05.462Z).
* 05:09 "akellison" was rejected (pending since 2024-05-17T02:07:24.229Z).
* 05:09 "mainowill" was rejected (pending since 2024-04-16T23:30:33.881Z).
* 05:09 "bzhqc" was rejected (pending since 2024-04-16T19:50:38.676Z).
* 05:09 "safan41" was rejected (pending since 2024-04-16T03:34:48.942Z).
* 05:09 "mgagat" was rejected (pending since 2024-04-16T03:21:51.764Z).
* 05:09 "okeamah" was rejected (pending since 2024-04-16T02:49:00.143Z).
* 05:09 "xuhao61" was rejected (pending since 2024-04-15T23:45:09.083Z).
* 04:47 "cybel" was rejected (pending since 2024-04-15T06:46:35.791Z).
=== 2025-01-20 ===
* 14:33 [[gitlab:your1|@your1]] was approved.
=== 2025-01-18 ===
* 10:09 [[gitlab:galrach600|@galrach600]] was approved.
* 02:51 [[gitlab:blankeclair|@blankeclair]] was approved.
=== 2025-01-17 ===
* 13:57 [[gitlab:dsantamaria|@dsantamaria]] was approved.
=== 2025-01-15 ===
* 17:12 [[gitlab:smartse|@smartse]] was approved.
=== 2025-01-14 ===
* 17:03 [[gitlab:naorleizer|@naorleizer]] was approved.
=== 2025-01-13 ===
* 02:45 [[gitlab:wolf20482|@wolf20482]] was approved.
=== 2025-01-12 ===
* 17:45 [[gitlab:tamzin|@tamzin]] was approved.
=== 2025-01-11 ===
* 15:24 [[gitlab:bargioni|@bargioni]] was approved.
* 14:30 [[gitlab:salelya|@salelya]] was approved.
* 10:15 [[gitlab:malakatshy|@malakatshy]] was approved.
* 05:21 [[gitlab:newmcpee|@newmcpee]] was approved.
=== 2025-01-09 ===
* 15:30 [[gitlab:gkyziridis|@gkyziridis]] was approved.
=== 2025-01-08 ===
* 16:21 [[gitlab:ukrface|@ukrface]] was approved.
=== 2024-12-28 ===
* 03:27 [[gitlab:twonum|@twonum]] was approved.
=== 2024-12-25 ===
* 06:09 [[gitlab:harsv567|@harsv567]] was approved.
=== 2024-12-21 ===
* 11:24 [[gitlab:amutha2002|@amutha2002]] was approved.
=== 2024-12-20 ===
* 19:51 [[gitlab:hridyeshgupta|@hridyeshgupta]] was approved.
* 10:00 [[gitlab:ro-shines|@ro-shines]] was approved.
* 08:09 [[gitlab:kesharwaniarpita|@kesharwaniarpita]] was approved.
=== 2024-12-18 ===
* 14:45 [[gitlab:soylacarli|@soylacarli]] was approved.
=== 2024-12-16 ===
* 20:33 [[gitlab:aleyasiddika1|@aleyasiddika1]] was approved.
=== 2024-12-15 ===
* 07:33 [[gitlab:abhishek02bhardwaj|@abhishek02bhardwaj]] was approved.
=== 2024-12-13 ===
* 13:18 [[gitlab:ashmitabathre204|@ashmitabathre204]] was approved.
=== 2024-12-10 ===
* 06:39 [[gitlab:ginaan|@ginaan]] was approved.
=== 2024-12-09 ===
* 05:45 [[gitlab:kallinavya|@kallinavya]] was approved.
* 00:54 [[gitlab:viserion-7|@viserion-7]] was approved.
=== 2024-12-08 ===
* 17:27 [[gitlab:wargo|@wargo]] was approved.
=== 2024-12-05 ===
* 11:15 [[gitlab:ranjithraj|@ranjithraj]] was approved.
=== 2024-12-02 ===
* 21:21 [[gitlab:a930913|@a930913]] was approved.
=== 2024-12-01 ===
* 02:39 [[gitlab:kingchristlike1|@kingchristlike1]] was approved.
=== 2024-11-21 ===
* 13:45 [[gitlab:sascha|@sascha]] was approved.
=== 2024-11-19 ===
* 16:36 [[gitlab:jly|@jly]] was approved.
=== 2024-11-15 ===
* 02:54 [[gitlab:danielyepezgarces|@danielyepezgarces]] was approved.
=== 2024-11-14 ===
* 14:15 [[gitlab:stimoroll|@stimoroll]] was approved.
=== 2024-11-09 ===
* 17:15 [[gitlab:f4udeveloper|@f4udeveloper]] was approved.
=== 2024-11-07 ===
* 19:15 [[gitlab:zulf|@zulf]] was approved.
* 05:33 [[gitlab:hassanamin|@hassanamin]] was approved.
=== 2024-11-06 ===
* 19:39 [[gitlab:daniuu|@daniuu]] was approved.
* 00:18 [[gitlab:rlopez-wmf|@rlopez-wmf]] was approved.
=== 2024-10-09 ===
* 14:45 [[gitlab:jtweed|@jtweed]] was approved.
* 10:24 [[gitlab:ifrahkh|@ifrahkh]] was approved.
* 09:06 [[gitlab:wikibayer|@wikibayer]] was approved.
=== 2024-10-06 ===
* 10:27 [[gitlab:keerthan16|@keerthan16]] was approved.
=== 2024-10-04 ===
* 07:45 [[gitlab:hakimi97|@hakimi97]] was approved.
=== 2024-09-30 ===
* 07:39 [[gitlab:ninjastrikers|@ninjastrikers]] was approved.
=== 2024-09-28 ===
* 17:30 [[gitlab:webrunner95|@webrunner95]] was approved.
=== 2024-09-18 ===
* 21:39 [[gitlab:elliottetzkorn|@elliottetzkorn]] was approved.
=== 2024-09-14 ===
* 22:06 [[gitlab:humptydumpty|@humptydumpty]] was approved.
=== 2024-09-06 ===
* 08:48 [[gitlab:mickabarber|@mickabarber]] was approved.
=== 2024-08-27 ===
* 17:36 [[gitlab:edgars|@edgars]] was approved.
=== 2024-08-22 ===
* 09:18 [[gitlab:antonkokhwmde|@antonkokhwmde]] was approved.
=== 2024-08-14 ===
* 19:21 [[gitlab:jfk|@jfk]] was approved.
=== 2024-08-13 ===
* 17:57 [[gitlab:daxserver|@daxserver]] was approved.
=== 2024-08-11 ===
* 09:57 [[gitlab:pauliesnug|@pauliesnug]] was approved.
=== 2024-08-10 ===
* 08:42 [[gitlab:ashig|@ashig]] was approved.
=== 2024-08-09 ===
* 14:09 [[gitlab:masssly|@masssly]] was approved.
=== 2024-08-05 ===
* 22:15 [[gitlab:mrtortue|@mrtortue]] was approved.
=== 2024-08-02 ===
* 16:21 [[gitlab:dsantini|@dsantini]] was approved.
=== 2024-07-31 ===
* 11:54 [[gitlab:cptviraj|@cptviraj]] was approved.
=== 2024-07-30 ===
* 19:09 [[gitlab:iniquity|@iniquity]] was approved.
* 10:00 [[gitlab:collins|@collins]] was approved.
=== 2024-07-27 ===
* 15:57 [[gitlab:songnguxyz|@songnguxyz]] was approved.
=== 2024-07-25 ===
* 12:36 [[gitlab:mszabo|@mszabo]] was approved.
* 09:21 [[gitlab:agarwalmahima|@agarwalmahima]] was approved.
=== 2024-07-24 ===
* 08:05 [[gitlab:dragoniez|@dragoniez]] was approved.
=== 2024-07-23 ===
* 06:54 [[gitlab:mirji|@mirji]] was approved.
=== 2024-07-16 ===
* 10:00 [[gitlab:lakejason0|@lakejason0]] was approved.
=== 2024-07-12 ===
* 11:33 [[gitlab:cn|@cn]] was approved.
* 08:12 [[gitlab:unchampignon|@unchampignon]] was approved.
=== 2024-07-07 ===
* 17:12 [[gitlab:agamyasamuel|@agamyasamuel]] was approved.
* 05:24 [[gitlab:kuldeepburjbhalaike|@kuldeepburjbhalaike]] was approved.
=== 2024-07-06 ===
* 11:18 [[gitlab:dibya|@dibya]] was approved.
* 04:54 [[gitlab:sarthakparashar|@sarthakparashar]] was approved.
=== 2024-07-05 ===
* 18:15 [[gitlab:vanshikarathi|@vanshikarathi]] was approved.
=== 2024-07-02 ===
* 19:00 [[gitlab:ebrahim|@ebrahim]] was approved.
=== 2024-07-01 ===
* 20:12 [[gitlab:rockingpenny4|@rockingpenny4]] was approved.
* 18:15 [[gitlab:balajijagadesh|@balajijagadesh]] was approved.
=== 2024-06-30 ===
* 18:24 [[gitlab:hrideshmg|@hrideshmg]] was approved.
* 07:18 [[gitlab:chanakyakumardas|@chanakyakumardas]] was approved.
* 06:30 [[gitlab:rihaan180|@rihaan180]] was approved.
=== 2024-06-27 ===
* 17:36 [[gitlab:driedmueller|@driedmueller]] was approved.
=== 2024-06-19 ===
* 12:57 [[gitlab:audreypenven|@audreypenven]] was approved.
=== 2024-06-16 ===
* 01:18 [[gitlab:roysmith|@roysmith]] was approved.
=== 2024-06-08 ===
* 02:45 [[gitlab:jleedev|@jleedev]] was approved.
=== 2024-06-03 ===
* 13:57 [[gitlab:afeder|@afeder]] was approved.
=== 2024-06-01 ===
* 10:54 [[gitlab:florianschmitt|@florianschmitt]] was approved.
=== 2024-05-30 ===
* 16:42 [[gitlab:krlsca|@krlsca]] was approved.
=== 2024-05-28 ===
* 11:24 [[gitlab:rickijay|@rickijay]] was approved.
=== 2024-05-26 ===
* 11:18 [[gitlab:ranjithsiji|@ranjithsiji]] was approved.
=== 2024-05-25 ===
* 07:24 [[gitlab:jony|@jony]] was approved.
=== 2024-05-23 ===
* 08:45 [[gitlab:lepticed7|@lepticed7]] was approved.
=== 2024-05-22 ===
* 20:42 [[gitlab:echecs|@echecs]] was approved.
=== 2024-05-21 ===
* 13:33 [[gitlab:mbs|@mbs]] was approved.
=== 2024-05-19 ===
* 18:06 [[gitlab:ionenlaser|@ionenlaser]] was approved.
=== 2024-05-18 ===
* 23:36 [[gitlab:mdaniels5757|@mdaniels5757]] was approved.
=== 2024-05-17 ===
* 08:54 [[gitlab:grapedog|@grapedog]] was approved.
=== 2024-05-08 ===
* 19:42 [[gitlab:kelhurd|@kelhurd]] was approved.
* 19:06 [[gitlab:khurd|@khurd]] was approved.
=== 2024-05-06 ===
* 19:48 [[gitlab:j3j5|@j3j5]] was approved.
* 12:06 [[gitlab:tk-999|@tk-999]] was approved.
=== 2024-05-05 ===
* 22:09 [[gitlab:pppery|@pppery]] was approved.
* 20:33 [[gitlab:sakretsu|@sakretsu]] was approved.
* 12:12 [[gitlab:waterquark|@waterquark]] was approved.
=== 2024-05-04 ===
* 09:03 [[gitlab:multichill|@multichill]] was approved.
* 07:42 [[gitlab:abaris|@abaris]] was approved.
=== 2024-05-03 ===
* 14:57 [[gitlab:maurusian|@maurusian]] was approved.
=== 2024-04-24 ===
* 05:48 [[gitlab:wolfinux|@wolfinux]] was approved.
=== 2024-04-23 ===
* 15:48 [[gitlab:dreamrimmer|@dreamrimmer]] was approved.
=== 2024-04-21 ===
* 06:51 [[gitlab:alon|@alon]] was approved.
=== 2024-04-17 ===
* 23:33 [[gitlab:derenrich|@derenrich]] was approved.
=== 2024-04-16 ===
* 17:18 [[gitlab:valcio|@valcio]] was approved.
=== 2024-04-14 ===
* 16:51 [[gitlab:wikilucas00|@wikilucas00]] was approved.
=== 2024-04-06 ===
* 12:48 [[gitlab:theprotonade|@theprotonade]] was approved.
=== 2024-04-02 ===
* 07:30 [[gitlab:bohuizhang|@bohuizhang]] was approved.
=== 2024-03-30 ===
* 13:36 [[gitlab:lpintscher|@lpintscher]] was approved.
=== 2024-03-26 ===
* 17:09 [[gitlab:eenabulele|@eenabulele]] was approved.
=== 2024-03-25 ===
* 14:27 [[gitlab:tuukka|@tuukka]] was approved.
=== 2024-03-24 ===
* 12:24 [[gitlab:firefly|@firefly]] was approved.
=== 2024-03-21 ===
* 19:33 [[gitlab:universal-omega|@universal-omega]] was approved.
=== 2024-03-17 ===
* 10:36 [[gitlab:bisel91|@bisel91]] was approved.
=== 2024-03-16 ===
* 10:09 [[gitlab:delord|@delord]] was approved.
* 00:42 [[gitlab:athulvis1|@athulvis1]] was approved.
=== 2024-03-15 ===
* 19:06 [[gitlab:ignaciorodrguez|@ignaciorodrguez]] was approved.
* 08:30 [[gitlab:peachey88|@peachey88]] was approved.
* 06:51 [[gitlab:derick|@derick]] was approved.
=== 2024-03-12 ===
* 15:06 [[gitlab:xiaoxiao|@xiaoxiao]] was approved.
=== 2024-03-06 ===
* 13:21 [[gitlab:desianabae1|@desianabae1]] was approved.
=== 2024-03-05 ===
* 19:21 [[gitlab:ep1c|@ep1c]] was approved.
* 16:33 [[gitlab:jasmine|@jasmine]] was approved.
=== 2024-03-02 ===
* 06:42 [[gitlab:potsdamlamb|@potsdamlamb]] was approved.
=== 2024-02-29 ===
* 23:18 [[gitlab:arandomname123|@arandomname123]] was approved.
* 18:03 [[gitlab:baba|@baba]] was approved.
* 17:48 [[gitlab:yfdyh000|@yfdyh000]] was approved.
* 03:09 [[gitlab:sds|@sds]] was approved.
=== 2024-02-27 ===
* 23:33 [[gitlab:lofhi|@lofhi]] was approved.
=== 2024-02-15 ===
* 19:45 [[gitlab:gergesshamon|@gergesshamon]] was approved.
=== 2024-02-14 ===
* 14:33 [[gitlab:philipnelson99|@philipnelson99]] was approved.
=== 2024-02-13 ===
* 13:06 [[gitlab:dringsim|@dringsim]] was approved.
=== 2024-02-12 ===
* 17:36 [[gitlab:haak|@haak]] was approved.
=== 2024-02-05 ===
* 17:33 [[gitlab:qwerfjkl|@qwerfjkl]] was approved.
* 17:14 [[gitlab:ahecht|@ahecht]] was approved.
=== 2024-02-01 ===
* 09:27 [[gitlab:arinaigum|@arinaigum]] was approved.
* 00:15 [[gitlab:jas42|@jas42]] was approved.
* 00:15 [[gitlab:edhu|@edhu]] was approved.
* 00:15 [[gitlab:marnanel|@marnanel]] was approved.
* 00:15 [[gitlab:ibrahemqasim|@ibrahemqasim]] was approved.
* 00:15 [[gitlab:amasotti|@amasotti]] was approved.
* 00:15 [[gitlab:deni|@deni]] was approved.
* 00:15 [[gitlab:cyber|@cyber]] was approved.
* 00:15 [[gitlab:saroj|@saroj]] was approved.
=== 2024-01-29 ===
* 21:42 [[gitlab:rgupta|@rgupta]] was approved.
=== 2024-01-07 ===
* 09:48 [[gitlab:lutrome|@lutrome]] was approved.
=== 2024-01-05 ===
* 20:48 [[gitlab:jinoytommanjaly|@jinoytommanjaly]] was approved.
* 02:51 [[gitlab:braunobruno|@braunobruno]] was approved.
* 01:08 [[gitlab:amorymeltzer|@amorymeltzer]] was approved.
* 01:08 [[gitlab:phi22ipus|@phi22ipus]] was approved.
=== 2024-01-03 ===
* 14:45 [[gitlab:gabina|@gabina]] was approved.
=== 2024-01-02 ===
* 13:18 [[gitlab:arthurtaylor|@arthurtaylor]] was approved.
=== 2023-12-23 ===
* 00:33 [[gitlab:aram|@aram]] was approved.
=== 2023-12-22 ===
* 16:24 [[gitlab:elpitareio|@elpitareio]] was approved.
=== 2023-12-21 ===
* 00:43 [[gitlab:bsadowski1|@bsadowski1]] was approved.
* 00:43 [[gitlab:ederporto|@ederporto]] was approved.
* 00:43 [[gitlab:sadraiiali|@sadraiiali]] was approved.
* 00:43 [[gitlab:wasp-outis|@wasp-outis]] was approved.
* 00:43 [[gitlab:bodhisattwa|@bodhisattwa]] was approved.
* 00:43 [[gitlab:air7538|@air7538]] was approved.
* 00:43 [[gitlab:anzx|@anzx]] was approved.
* 00:43 [[gitlab:tekask1903|@tekask1903]] was approved.
* 00:42 [[gitlab:kiwi-0x010c|@kiwi-0x010c]] was approved.
* 00:42 [[gitlab:mpaa|@mpaa]] was approved.
* 00:42 [[gitlab:kutay|@kutay]] was approved.
* 00:42 [[gitlab:wattmto|@wattmto]] was approved.
2fgh0zaybrsevnh24p8tvvjh3jhst6g
News/2024 Migrating Wikitech Account to SUL
0
455262
2396610
2283572
2026-03-29T08:12:14Z
Minorax
38339
2396610
wikitext
text/x-wiki
{{Tracked|T161859|Resolved}}
'''wikitech.wikimedia.org wiki accounts''' have migrated from [[mw:Special:MyLanguage/Developer account|Developer accounts]] to [[meta:Special:MyLanguage/Help:Unified login|'''Wikimedia Unified Login (SUL/single user login)''']] accounts. This work unblocked wikitech.wikimedia.org from migrating to hosting in the [[Kubernetes/Clusters#WikiKube|"WikiKube" Kubernetes cluster]] that hosts Wikimedia movement project wikis.
Now that the process is complete:
* [[mw:Developer account|Developer accounts]] are managed via [[IDM|'''Wikimedia IDM''']] and not Wikitech.
* Users are able to log in and edit Wikitech using their Wikimedia Unified Login (SUL). [[m:SUL|Read more about SUL on Meta-Wiki]].
== Timeline ==
* '''2024-10-01''': {{Done}} Wikitech detached from Developer accounts and configured so that local accounts can be attached to Wikimedia SUL accounts with matching usernames.
* <s>2024-11-30</s> '''2025-02-10''': {{Done}} All legacy local accounts that have been linked to a Wikimedia SUL account via idm.wikimedia.org or toolsadmin.wikimedia.org will be renamed if necessary and then attached to the SUL account.<br />Some renames and attachments may happen sooner as admins find the time to test and refine the process.
* <s>2025-01-30</s> '''2025-02-24''': {{Done}} Final migration. Wikitech config changed to only use Wikimedia SUL accounts. SUL account mappings added since the prior migration will be processed. All remaining unattached accounts will be converted to SUL accounts. This final conversion will include renaming any local accounts that match existing SUL accounts to avoid name collisions.
{{Note|text=Merging two MediaWiki accounts is not currently possible.}}
== What you needed to do ==
=== Wikitech is rejecting my password ===
You will have to request a password reset through [[Special:PasswordReset]] if you wish to edit Wikitech prior to the SUL unification step. This is necessary because Wikitech did not store Developer account passwords.
=== My Wikimedia Developer Account username matches my Wikimedia Unified Login (SUL) username ===
Great news! Please use the [[Special:MergeAccount]] option in Wikitech, and merge the two. You are done!
=== My Wikimedia Developer Account username is <u>different</u> from my Wikimedia Unified Login (SUL) username ===
* Please visit https://idm.wikimedia.org and follow the instructions provided to link your Developer account to your existing Wikimedia Global Account (SUL). Linked Wikitech accounts will be renamed on 2025-02-10 and 2025-02-24.
* Alternatively, go to [[/Rename requests|Rename requests]] and submit your username change request.
=== I have followed the instructions, but I am still having issues. ===
Trouble logging in? Report it on [[phab:T376267|task T376267]].
== What changed ==
One of the historic roles of wikitech.wikimedia.org has been the creation and management of [[mw:Special:MyLanguage/Developer account|Developer account]]s. In order to manage these accounts, Wikitech itself uses [[mw:Extension:LDAP Authentication|Extension:LDAP Authentication]] to connect local MediaWiki accounts with the Developer account data that we store in an LDAP directory. The LDAP Authentication extension itself has been largely unmaintained for many years. Other communities who used it have moved on to [[mw:Special:MyLanguage/LDAP Stack|replacement systems]], but Wikitech has not. This was largely due to the use of [[mw:Extension:OpenStackManager|Extension:OpenStackManager]] by Wikitech. Extension:OpenStackManager required Extension:LDAP Authentication to function.
After several years of work, Extension:OpenStackManager has been fully replaced by a combination of [[Help:Horizon FAQ|horizon.wikimedia.org]] and [[IDM|idm.wikimedia.org]]. This has unblocked a [[phab:T161859|long-desired migration]] of wikitech.wikimedia.org from using Developer accounts to identify editors to using SUL accounts. Virtually everyone contributing to the Wikimedia movement has a SUL account, but only a small percentage of us have Developer accounts. We hope that the SUL migration will allow more Wikimedians to contribute to the technical documentation here on Wikitech. It will also remove a special dependency from the wiki, which will make it easier for us to [[phab:T237773|move to a different hosting configuration]].
== Account migration details ==
; Developer accounts that are linked to SUL accounts
: If you link your Developer and SUL accounts by the deadline, edits made by your legacy Wikitech account will be migrated to your Wikimedia Global Account (SUL). This will retain your full edit history and attribution for past contributions in one place.
; Unlinked Developer accounts
: If you do not link your accounts by the deadline, your legacy Wikitech account will be migrated to a local placeholder account (for example <code>OLDWIKITECHUSERNAME~labswiki</code>). This is similar to what was done for unclaimed local accounts during the [[m:Single User Login finalisation announcement|SUL migration for content wikis]] in 2015. Your edits will still be visible in page history, but you will no longer be able to log in to that account, and new edits will be made under a new SUL account with no immediate link to your past edits.
If you have already linked your accounts, no further action is required at this time.
If you have any questions or need assistance, please do not hesitate to contact us at [[Portal:Toolforge/About Toolforge#Communication and support]].
We are also maintaining an FAQ section below that hopefully answers some questions you may have.
== FAQ ==
=== Will my Wikitech user name change? ===
Possibly, yes:
* If your Wikitech account and your SUL account have matching names, for example [[User:Legoktm]] and [[meta:User:Legoktm]], then neither user name will change.
* If your Wikitech account and your SUL account have different names, for example [[User:BryanDavis]] and [[meta:User:BDavis (WMF)]], then your Wikitech account will be renamed to match your SUL account.
* Wikitech accounts that do not have a SUL account associated when the migration occurs will be renamed to ensure that they do not collide with present or future SUL accounts by appending something like <code>~labswiki</code> to the former account name. (<code>~</code> is a character not normally allowed in Wikimedia SUL account names.) These accounts will also become inaccessible by virtue of not having an associated password. It is currently unknown if these accounts will be recoverable in the future.
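The three cases above can be summarised as a small decision rule. The following is only an illustrative sketch, not the script the admins ran: the function name <code>migrated_username</code> and the constant <code>PLACEHOLDER_SUFFIX</code> are hypothetical, and the exact suffix is assumed to be <code>~labswiki</code> as in the example above.

```python
from typing import Optional

# Assumed suffix, taken from the "~labswiki" example in the text above.
PLACEHOLDER_SUFFIX = "~labswiki"


def migrated_username(wikitech_name: str, linked_sul_name: Optional[str]) -> str:
    """Return the name a legacy Wikitech account would carry after migration.

    linked_sul_name is the SUL account linked to the Developer account,
    or None when no SUL account was linked before the migration.
    """
    if linked_sul_name is None:
        # Unlinked account: renamed to a placeholder that cannot collide
        # with present or future SUL account names.
        return wikitech_name + PLACEHOLDER_SUFFIX
    # Linked account: the Wikitech account ends up named after the SUL
    # account (a no-op when the two names already match).
    return linked_sul_name


print(migrated_username("Legoktm", "Legoktm"))
print(migrated_username("BryanDavis", "BDavis (WMF)"))
print(migrated_username("OLDWIKITECHUSERNAME", None))
```

The three calls mirror the cases in the list: matching names, differing names, and an account with no linked SUL account.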
=== Will my Developer account user name or shell account name change? ===
No. Wikitech is being separated from [[mw:Developer account|Developer account]]s, but no other systemic changes to Developer accounts are planned at this time.
=== Will my logins to Cloud VPS, Toolforge, Horizon, Gerrit, GitLab, Phabricator change? ===
No. These systems all use [[mw:Developer account|Developer account]]s today and will continue to use Developer accounts in the foreseeable future.
Your Phabricator account may already be connected to your SUL account, as Phabricator logins can use either system, or even both. This will not be changing as part of the Wikitech SUL migration.
=== How will I manage the password, email address, ssh keys, etc of my Developer account after the SUL migration? ===
https://idm.wikimedia.org has already become the central location for [[mw:Developer account|Developer account]] management. You can read more about this identity management service at [[IDM]].
=== When will the SUL migration happen? ===
See [[#Timeline]] for current estimated dates. Please be aware that these dates may change due to many factors.
== See also ==
* [[listarchive:list/wikitech-l@lists.wikimedia.org/message/SNUR67LXJM5YNEXZVMUNZGPCNRSFKI2V/|[Wikitech-l] Wikitech accounts Migration to Wikimedia Unified Login (SUL) and introduction of Wikimedia IDM for developer accounts]]
* [[listarchive:list/wikitech-l@lists.wikimedia.org/thread/GXMCSJWUNN4O7UMXEAFNSKUUCO7KRTLV/|[Wikitech-l] Updated timeline for Wikitech SUL finalization]]
* [[listarchive:list/wikitech-l@lists.wikimedia.org/thread/TDSBWLR4OHDBNNGOX4Q2XE7I534UREME/|[Wikitech-l] Wikitech SUL migration is complete!]]
d7w8mra3o3oupl7gnouq4g26l6q39cd